Featured

Print is dying. Seriously.

Every journalist knows print is dying, but we still sometimes struggle to grasp the magnitude of the problem. While working on a broader project looking at changes in media consumption, I ended up putting together a chart showing per capita newspaper earnings in America – and the drop that begins in 2005 is simply stunning.

Per capita spending on daily newspapers (measured by newspaper earnings) over time.

My storyboard has some more info, but quite frankly, that chart alone is worth the price of admission.

Featured

College Students on Ashley Madison

“Grace” was in a religion class when she opened up her e-mail and found out that someone had pranked her by making her an account on Ashley Madison, the infidelity-themed web site. She’d had no idea until that day in April 2013, when she looked through her e-mails and saw that three men on the site had sent her “winks.”

Within the day, she’d requested that Ashley Madison delete her account. They agreed, and told her it would be “hidden” from the service. It wasn’t until Grace was contacted for this story that she even realized her data was still out there, and had become part of the Ashley Madison data breach and leak.

The 7 Year Itch everyone

Age distribution at the time of joining Ashley Madison, based on data from the Ashley Madison leaked databases. Men in orange, women in blue.

While the majority of Ashley Madison users were older, the site boasted millions of members, and drew that membership from all age ranges – including college students. At first, this seems strange: few college students are married, so they wouldn’t seem like the natural demographic to join a site whose motto is, “Life is short, have an affair.”

However, what college students lack in married malaise, they more than make up for in curiosity. Joshua Ullom seems to typify this group, saying that he and some friends made an account for “drunken laughs.” They didn’t talk to anyone on the site, and Ullom says that the absurdity of the entire thing was hard to get past.

“We were just talking about how crazy the whole concept of the website was and just wanted to look around.”

Other stories are similar. Most report being curious or bored, or stunned that such a brazen infidelity site could exist. Several hinted that they might have really tried the site, but refused to take part in the story.

Then there’s Bryan Pauley.

Pauley was a student at Kent State University when he signed up, and he signed up hoping to meet women. He said that he was intrigued by the site’s “guaranteed” results. Having tried several free dating sites at the time and found that they seemed filled with people who were “catfishing,” Pauley decided that he would try something else, and see if a paid dating site was any better.

Ashley Madison and women at signup

Distribution curve for women. Very similar to men, including the spike of membership from people who were 36 (or *claimed* they were 36) when they signed up.

Pauley’s hopes were disappointed.

He quickly found himself interacting with strange profiles that seemed to just “string you along.”

He said that these profiles all followed a similar pattern: A woman would message him, but refuse to ever send new pictures, or talk on the phone, or meet, or even chat on AOL Instant Messenger, which he said was very popular at the time.

Pauley believes that one of the profiles he exchanged messages with was an actual, honest, breathing woman – unlike the others, which he said he believes might have been fake profiles, meant to keep him interested and on the site. This seemingly authentic woman reported that she’d gone on a few dates with men she met through Ashley Madison, but the men were always “too aggressive.” Eventually, she stopped talking to Pauley, and after three months or so, Pauley closed his account and gave up on Ashley Madison.

The 7 Year Itch men

Age distribution for men when they joined Ashley Madison. Same jump at age 36, and if you compare the *scale* of the male distribution curve, you’ll see that there were roughly five times as many men as women on the site. This may explain some of the “aggression” described by the woman Pauley corresponded with.

“I just canceled it, and didn’t try online dating for several years,” Pauley said.

Others said that they never went as far as Pauley, with many saying that they quit as soon as they realized they’d have to pay to properly join.

Anthony Buck, a student at the University of Alabama at the time, said he signed up because his friends were hanging out and couldn’t believe that there was actually a site specifically for having an affair.

“My friends and I were all gathered around, and I guess I was the guinea pig,” Buck said. “I signed up once, and never got on again. I’m pretty sure I wasn’t even able to view anything ’cause I wasn’t about to pay for the membership.”

Grace’s single day of membership was more than enough for her. She says she’s offended that her data is still out there, that this prank profile has come back from her past.

“I said, please, permanently remove anything associated with this name and this e-mail address, and they didn’t,” Grace said. “That’s not okay. I don’t like that. I know they suck.”

Click on the map below to get an interactive view of colleges around the country with e-mail addresses in the leak!

Map shows total number of matching e-mail addresses for colleges and universities across the country. Note that there is no way to know how many of those e-mail addresses were used deliberately by their owners.
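For readers curious how a tally like the one behind the map could be produced, here is a minimal Python sketch. It is not the method used for the story: it simply assumes that a college address is any address whose domain ends in ".edu", and the sample addresses are invented placeholders.

```python
from collections import Counter


def count_by_college_domain(addresses):
    """Count addresses per domain, keeping only domains that end in .edu.

    Assumption for illustration: ".edu" is treated as the marker of a
    college or university address.
    """
    counts = Counter()
    for address in addresses:
        # Split on the last "@"; sep is empty if the string has no "@".
        _, sep, domain = address.strip().lower().rpartition("@")
        if sep and domain.endswith(".edu"):
            counts[domain] += 1
    return counts


# Invented placeholder addresses, not real data.
sample = [
    "a@example.edu",
    "B@Example.EDU",
    "c@mail.example.com",
    "not-an-address.edu",
]
totals = count_by_college_domain(sample)
```

As the caption notes, a count like this says nothing about whether the owner of an address was the person who used it.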

So many ISPs, so little choice

When I recently got the chance to play with some data on Internet Service Providers courtesy of Decision Data, one of the biggest questions on my mind was this: Just how many companies compete for our business?

competing-coverage

Turns out, numerically speaking, quite a few. Decision Data has stats on a whopping 2,345 providers nationally. This means that plenty of zip codes have a lot of competition – on paper. It’s when you really look at that competition that you realize it’s not all that it’s cracked up to be.

Right off the bat, there’s wireless. Wireless internet access is fine (I happen to use it myself) but it isn’t exactly perfect. If you need or want to download a lot of data in a month, you’ll hit your cap pretty quickly on most wireless plans (or pay a small fortune).

So I’m not sure we can count all of those cell phone providers as true ISPs. Sure, a lot of us get a lot of data delivered via cell phone tower – but most of us need an Internet that won’t run out of capacity if we decide to binge watch Buffy.

You also have commercial/business providers like Level 3 Communications. I’ll bet they’re fast and allow plenty of data per month – and I’ll bet they wouldn’t be interested in a small-fry customer like me, even if I *do* refer to myself as a business. (It’s a freelance thing.) In fact, the number of business providers came as something of a surprise to me – apparently, businesses have plenty of choices. We consumers, it would seem, not so much.

Side note: In spite of how awful our Internet speeds are supposed to be when compared to places like South Korea, my map of the highest speeds available by zip shows that (theoretically, anyhow) much of America has access to gigabit-speed Internet. In the map below, red is fast, blue is slow.

the-fasts-and-the-fast-nots

My zip code has 13 different providers, in theory. Out of all of them, I think only three would be real “choices” for me, and last time I checked only one of them actually reaches my apartment – which means the other choices aren’t actual choices at all. Not unless I go run a couple thousand feet of Cat 5 cable to the nearest spot an alternate provider *does* reach.
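The narrowing-down described above (drop the wireless carriers, drop the business-only providers, see what is left) can be sketched in a few lines of Python. This is only an illustration: the row layout, provider names, zip codes, and type labels are all assumptions, not Decision Data’s actual format.

```python
from collections import defaultdict

# Hypothetical rows: (zip_code, provider_name, provider_type).
providers = [
    ("00001", "Provider A", "cable"),
    ("00001", "Provider B", "wireless"),
    ("00001", "Provider C", "business"),
    ("00001", "Provider D", "dsl"),
    ("00002", "Provider B", "wireless"),
]

# Assumed labels for the provider types excluded in the text above.
EXCLUDED_TYPES = {"wireless", "business"}


def residential_choices(rows):
    """Map each zip code to its providers that are neither wireless nor business-only."""
    choices = defaultdict(set)
    for zip_code, name, kind in rows:
        names = choices[zip_code]  # make sure every zip appears, even with no choices
        if kind not in EXCLUDED_TYPES:
            names.add(name)
    return {zip_code: sorted(names) for zip_code, names in choices.items()}


result = residential_choices(providers)
```

Even this filter overstates the options, since it cannot tell which providers actually reach a given address.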

70806-isps

So I’ve basically got the one choice, if I want to move away from my grandfathered (and truly unlimited) wireless data plan. Like most Americans, when it comes to Internet access, I’m overwhelmed with options but underwhelmed by most of them.

But at least it ain’t dial-up.

A Borderlands 2 Damage-Per-Minute Calculator

So, I’ve been playing Borderlands 2 with my son lately, and comparing weapons has been bugging me. There are a bunch of factors, and it’s a hassle juggling them in your head as you compare every single one.

But then I realized: Tableau could juggle *for* me.

Borderlands 2 Damage Per Minute Calculator

So here it is, my Borderlands 2 DPM calculator tool:

Borderlands 2 Damage Per Minute Tool

The tool isn’t terribly fancy, but it *does* work. The cool part is that building the tool only took a few minutes. This is where Tableau can shine – in quick-and-dirty visualizations that you build on the fly to solve a single problem.
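For anyone who would rather see the arithmetic than the workbook, here is a rough Python sketch of a damage-per-minute estimate. It is not the formula inside the Tableau tool. It assumes four inputs (damage per shot, fire rate in shots per second, magazine size, and reload time in seconds) and ignores everything else, such as accuracy, elemental effects, and critical hits.

```python
def damage_per_minute(damage, fire_rate, magazine_size, reload_time):
    """Estimate sustained damage per minute for one weapon.

    Assumed inputs (hypothetical, for illustration):
      damage        - damage per shot
      fire_rate     - shots per second
      magazine_size - shots per magazine
      reload_time   - seconds needed to reload
    """
    if fire_rate <= 0 or magazine_size <= 0:
        raise ValueError("fire_rate and magazine_size must be positive")
    seconds_to_empty = magazine_size / fire_rate
    # One full cycle is emptying the magazine and then reloading it.
    cycle_seconds = seconds_to_empty + reload_time
    damage_per_cycle = damage * magazine_size
    return damage_per_cycle / cycle_seconds * 60


# Two made-up weapons: a hard hitter with a slow reload, and a fast light one.
slow_heavy = damage_per_minute(damage=400, fire_rate=1, magazine_size=6, reload_time=4)
fast_light = damage_per_minute(damage=100, fire_rate=5, magazine_size=30, reload_time=2)
print(slow_heavy, fast_light)
```

Including the reload time is what makes the comparison interesting, since a gun with a big number on its card can still lose out over a full minute of firing.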

Well, that, and in those giant, enterprise-level stories where you get a Big Reveal at the end!

Tableau introduction materials (originally tested on MC 3005)

Howdy, everyone! Here are all the links you may or may not need for the workshop on Friday:

First, download Tableau Public, if you haven’t already:

https://www.tableau.com/public/download

Second, download the county rankings data, from here:

http://www.countyhealthrankings.org/rankings/data

We want the top file – the 2015 county rankings in XLS format (Excel format).

http://www.countyhealthrankings.org/sites/default/files/2015CountyHealthRankingsNationalData.xls

I also have two Tableau visualizations that I’d like to use to demonstrate both how I use Tableau and how I conduct data investigations. During the workshop, I’ll show how to download these workbooks straight into Tableau, and then make them your own:

https://public.tableau.com/views/CountryTrendsData2015/CountryTrendsData?:embed=y&:showTabs=y&:display_count=yes

https://public.tableau.com/views/EPADMRPollutants_0/EPADMRPollutants?:embed=y&:showTabs=y&:display_count=yes

I’ll be sure to explain plenty more during the workshop, including some of the other tools I use – like OpenRefine, and Workspace Macro Pro. One word of warning: I work on a Windows platform, and many of the programs I use aren’t available for Macintosh. I poked around online and found some promising Macintosh macro programs, but I’m not aware of any substitutes for OpenRefine. It does appear that Tabula can work with the Macintosh (yes, Tabula sounds like Tableau, but it’s entirely different). This is a good thing, as Tabula can help you get to the data locked up in PDFs, which I find enormously helpful.

After class, I’ll keep this blog post up for anyone who needs to look back and reference anything. Given that I seldom update my blog, much to my own chagrin, folks should have no trouble locating this particular entry!

No neutrality for Swiss Americans

Soon I’ll wrap this series up – but first, we need to take a look at Swiss-Americans, and their reverence for the Mason-Dixon line.

The Swiss aren’t particularly common in America, but they’re notably sparse in the South. Apparently, whatever brought those with Swiss ancestry to the United States didn’t involve a love of sweet tea and kudzu.

Link to full Swiss tableau: https://public.tableausoftware.com/views/supercensus/SwissAmericans?:embed=y&:display_count=no

Hispanic America is a Mexican America

Today we look past the general term Hispanic to see where America’s Hispanics truly come from. Calling someone “Hispanic” is like calling them “European.” It speaks somewhat to their ancestral heritage, but encompasses a *very* broad swath of very different folks.

Yet when we map the United States by the most populous Hispanic sub-type in each zip code, we find the Hispanic population of America is, overwhelmingly, Mexican.

To arrive at this took a very tedious formula. I’m sure there’s some better way to do it, but the only way I knew was with an IF, AND, THEN, ELSE statement. Basically, for each sub-type, I made a formula that said, “If there are more of this group than that group, and that group, and that group, and that group (etc.) then this group is the biggest group in the zip code.” If the group wasn’t the largest, the code looked at the next one. And so forth, and so on. Here is the code for just one of the ethnic groups I looked at:

if [Guatemalan]>[- Cuban] and [Guatemalan]>[- Dominican (Dominican Republic)] and [Guatemalan]>[- Mexican] and [Guatemalan]>[- Other Hispanic or Latino: – Spaniard] and [Guatemalan]>[- Other Hispanic or Latino: – Spanish] and [Guatemalan]>[- Other Hispanic or Latino: – Spanish American] and [Guatemalan]>[- Puerto Rican] and [Guatemalan]>[Argentinean] and [Guatemalan]>[Bolivian] and [Guatemalan]>[Chilean] and [Guatemalan]>[Colombian] and [Guatemalan]>[Costa Rican] and [Guatemalan]>[Ecuadorian] and [Guatemalan]>[Honduran] and [Guatemalan]>[Nicaraguan] and [Guatemalan]>[Other Central American] and [Guatemalan]>[Other South American] and [Guatemalan]>[Panamanian] and [Guatemalan]>[Paraguayan] and [Guatemalan]>[Peruvian] and [Guatemalan]>[Salvadoran] and [Guatemalan]>[Uruguayan] and [Guatemalan]>[Venezuelan] then "Guatemalan"
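Outside of Tableau, the same “which group is biggest in this zip code” question boils down to taking a maximum. Here is a minimal Python sketch of that idea. It is not what the workbook does: the zip codes and counts are made up, and only three of the groups are included to keep it short. Like the strict “greater than” comparisons in the formula, it declares no winner when two groups tie.

```python
# Hypothetical counts per zip code; the group names echo a few of the census
# fields, but the zip codes and numbers are made up for illustration.
zip_counts = {
    "00001": {"Mexican": 120, "Guatemalan": 45, "Cuban": 10},
    "00002": {"Mexican": 30, "Guatemalan": 5, "Cuban": 80},
    "00003": {"Mexican": 0, "Guatemalan": 0, "Cuban": 0},
}


def largest_group(counts):
    """Return the group with the strictly highest count, or None on a tie or empty input."""
    if not counts:
        return None
    top_name, top_count = max(counts.items(), key=lambda item: item[1])
    # Mirror the strict ">" comparisons in the formula: a tie means no winner.
    tied = [name for name, count in counts.items() if count == top_count]
    return top_name if len(tied) == 1 else None


biggest_by_zip = {zip_code: largest_group(counts) for zip_code, counts in zip_counts.items()}
```

Adding another group then means adding another entry to the data rather than writing another long chain of comparisons.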

Here’s a link to the Tableau tool: https://public.tableausoftware.com/profile/jared5561#!/vizhome/themexicans/HispanicAmerica