Category Archives: Data

The Known Facts About Donald Trump

Donald and his dad Fred, to whom he owes a lot, as Newsweek details.

Three authoritative pieces about Donald Trump have emerged in recent days. These are based on solid, straightforward reporting by Newsweek, The Atlantic and The Washington Post, and are followed by Keith Olbermann’s oxygen-depleting recitation of factual reasons Donald Trump shouldn’t be president.

These stories are all over the place today, but I’m pinning them here just in case anyone lands here who needs to be reminded what their vote for Trump is actually a vote for.

LINK: A Modern Library

The Internet Archive is one of the most spellbinding places online. To visit is to get lost in the World Wide Web’s past, or to revisit a concert one was at 30 years ago, or more recently (for me) to play Lemmings again for the first time since the early 90s. You might call that a waste of time, or you might call it a reminder that we march on, leaving our past behind. Which isn’t always to the good.

Playing games is the cookie that lures you into IA’s new operating-system-in-a-browser emulator, but Andy Baio explains in this article that the main dish (pardon the metaphor) is the ability to access and use all the data that has been lost as these old operating systems became outmoded. If you can get the files off the floppy disks.

LINK: Bad Taxi Math

I Quant NY is a Tumblr dedicated to New York City data stories.

It generally relies on the city’s Open Data sets, but this story started with an article in BusinessWeek about taxi tipping. The reporter actually had to request the data, and then reported some results that I Quant NY had trouble with.

The result is a wonderful (and I use that term precisely, though I know it is also funny) detective story investigated through math, and nailed with actual evidence: receipts! It won’t spoil things if I reveal that a sizable, if possibly unintentional, fraud is involved.

Wonderful, really.

The Oatmeal on Net Neutrality

The basic idea is that massive megaliths like Comcast and Time Warner (wait, are they merging?), and every other cable company that has a monopoly because of community franchising, have some community responsibility.

That responsibility is called Net Neutrality.

Net Neutrality means different things to every communication company trying to stick citizens with higher bills for their cable and internet service.

To all of them it means lower revenue.

But to the people who pay absurd cable and internet bills each month, net neutrality means that no matter what any company offers over the internet pipes, the price is the same.

Competing services, like Redbox, Amazon and Netflix, might have different business strategies, might have different owners, but each should pay the same amount to transfer their data through the internet to your house.

That’s net neutrality.

The same should be true if you’re selling Marxism, Leninism or Maoism. The price is for bandwidth, not ideology.

Comcast and Time Warner and your cable company would like you to think that this is unfair. They’re wrong. They’ve tried to sell you on paying extra for faster pipes, and better video speeds. They may have made money doing this.

But the basic principle of the internet is equality, and that breaks down quickly when those who own the pipes are able to discriminate between different data streams passing through.

Which is why this cartoon from the Oatmeal resonates:

http://theoatmeal.com/blog/net_neutrality

LINK: It’s Election Day and Someone is Writing About Nate Silver!

We all love predictions, and Nate Silver has proven himself adept at making them, so it’s understandable that attention turns his way on Election Day.

What is also clear is that we have a hard time understanding the nature of a prediction, which is why Silver not only says what he thinks is going to happen but also offers the odds that he’ll be wrong. To determine these odds Silver turns to Bayes’ Theorem and the more modern Bayesians, who have developed a way to measure uncertainty in a prediction based on the work of the English statistician and minister, Thomas Bayes (pictured).

This truth in packaging is what makes even Silver’s miscalls informational.

The mathematician Jordan Ellenberg takes a look in Slate at how many of Silver’s predictions will be wrong today, if Silver’s self-claimed odds of being wrong are correct. It won’t spoil the fun of reading the piece for me to tell you that Silver should be wrong about 2.5 Senate races.
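For the curious, the arithmetic behind a number like that is simple enough to sketch. If a forecaster calls each race for the favorite, the chance of a miss in that race is one minus the favorite's probability, and the expected number of misses is just the sum of those chances. The probabilities below are made up for illustration (they are not Silver's actual forecasts), and the function name is my own:

```python
def expected_misses(win_probabilities):
    """Expected number of wrong calls, assuming the stated odds are accurate.

    Each entry is the forecast probability that one particular candidate
    wins a race. The forecaster "calls" the race for whoever is favored,
    so the chance of a miss in that race is 1 - max(p, 1 - p).
    """
    return sum(1 - max(p, 1 - p) for p in win_probabilities)


# Hypothetical forecast probabilities for six races, NOT real data.
forecast = [0.95, 0.80, 0.70, 0.60, 0.55, 0.25]

print(f"Expected wrong calls: {expected_misses(forecast):.2f}")
```

The point is that a forecaster who is honest about close races should expect to miss a few, and a perfect record would actually suggest the stated odds were too timid.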

A View of the Spreadsheet From the Early Days

I started using a spreadsheet to calculate my fantasy league standings, maybe in 1988. The program was called VP-Planner, and the front end of it was a clone of Lotus 1-2-3, which cost $500 at the time. VP-Planner cost $35, which for a regular guy was plenty.

But the real power and appeal of VP-Planner was its ability to access databases, and import specific data into the spreadsheet without a lot of tedious cutting and pasting.

Each week I would download the baseball stats from a sports gambler service in Las Vegas, massage the format to get it set, and click RECALC (F9) on my keyboard. My IBM XT would whirr and purr, and I would go to lunch.

Sometimes, the calculations were complete when I came back, sometimes not.

This week my friend Steven Levy reprinted a story he wrote about spreadsheets for Harper’s magazine at his new online mag, called Backchannel. It’s a fascinating look from a distant perch at the effect of the democratization of data and our ability to model systems quickly and fairly easily.

In the middle of the piece he writes: “The computer spreadsheet, like the transcontinental railroad, is more than a means to an end. The spreadsheet embodies, embraces, that end, and ultimately serves to reinforce it. As Marshall McLuhan observed, “We shape our tools and thereafter our tools shape us.” The spreadsheet is a tool, and it is also a world view — reality by the numbers. If the perceptions of those who play a large part in shaping our world are shaped by spreadsheets, it is important that all of us understand what this tool can and cannot do.”

I suspect data journalists like Nate Silver and Ezra Klein could not agree more.

Ezra Klein Is Pessimistic About Global Warming

There is an article at Vox by Ezra Klein that is well worth reading.

He has seven reasons why America will fail at global warming. If you don’t want to read it, he also has a brief video “conversation” with Ta-Nehisi Coates at the top of the page in which he summarizes the points he makes in more detail in the article.

Klein’s argument is pretty straightforward. We’re too late getting started fixing the climate (if it is fixable), and our political processes do a poor job of taking on a problem that requires long-term sacrifices for less-than-immediate gains.

He does a good job, too, of showing how the Republican position on putting a price on carbon has changed radically since John McCain ran for president (and lost). McCain supported it. He also writes about why we can’t expect to engineer our way out of the problem scientifically.

All of this is pretty depressing, but he also does a good job of covering his butt at the end, saying he is pessimistic but not fatalistic. He hopes a solution will emerge.

As I watched him talk with Coates in the video I realized that while Klein was limning the depths of the coming disaster, he was also painting this as an obvious problem for America. But as this map shows, while the US may produce a disproportionate percentage of the world’s carbon emissions, we also have one of the biggest cushions for absorbing climate change. It isn’t really our problem yet, unless we look further into the future.

Klein quotes Matt Yglesias on this: “Very few of us are subsistence farmers. Relatively few of us live in river deltas, flood plains, or small islands. We are rich enough to be able to feasibly undertake massive engineering projects to safeguard our at-risk population centers. And the country is sufficiently large and sparsely populated that people can move around in response to climate shocks.”

So the question becomes: how do we convince Americans to make significant changes and sacrifices when the short-term threat isn’t nearly as dire as the long-term threat?

Marching felt great, and we should do more of that, but we need to keep talking broadly about how the system works and why it isn’t really designed to answer this question. Maybe, I worry, we’re not designed to answer questions like this one as a species, but I’m not pessimistic. I’m pretty sure that we will hammer on this problem, as on other ones, with increasing urgency. And while we talk about it and argue about it and elect public officials who recognize the problems, we’ll make progress.

Will it be enough, soon enough? I hope so.

LINK: How St. Louis County Profits From Poverty

One of the most important jobs of journalism is to explain the way things work. Or, in the case of St. Louis County and its many municipalities, how they don’t work. Radley Balko explains it all in this long and hugely worthwhile Washington Post story, which does a masterful job of making one feel the aggravating and humiliating context of the Ferguson demonstrations, as well as lining up a surprisingly bureaucratic and appalling series of villains.

In this case, the villains are a history of white flight, a checkerboard of self-serving municipalities that rely on legal fines to pay for themselves, and the slow development of a social culture that results in outstanding bench warrants that stifle the efforts of citizens to get a job or job training, or most egregiously, to run their own businesses.

Click the image to see a larger version of the map of numbered municipalities, each given its own stretch of highway from which to profit.

LINK: A Database Tracking Incidents of Deadly Police Use of Force

“The nation’s leading law enforcement agency [FBI] collects vast amounts of information on crime nationwide, but missing from this clearinghouse are statistics on where, how often, and under what circumstances police use deadly force. In fact, no one anywhere comprehensively tracks the most significant act police can do in the line of duty: take a life,” according to the Las Vegas Review-Journal in its series Deadly Force (Nov. 28, 2011).

D. Brian Burghart is an editor of the Reno News and Review. Confronted with this information gap, one he has come to believe is intentionally maintained by the FBI and police forces across the country, he has set up a crowd-sourced database project to collect basic information about every incident of the use of deadly police force.

Progress relies on FOIA requests and research provided by volunteers. We can all help with this important project.