Pro Bono Data Analysis This Summer

Posted on Apr 4, 2017 in Analysis, News, Projects

Disadvantaged Student Performance

For the summer of 2017, the San Diego Regional Data Library will run a data internship program, working on pro bono data analysis projects for nonprofits, governments, and journalists in San Diego County. If your organization has questions that can be answered with data, you can have an undergraduate data analyst work on your project, with professional guidance, for up to 12 weeks.

Project sponsors must be able to describe their needs as a set of questions that can be answered with available data, and must be able to meet with the interns at their site for 1 hour per week. Projects must have a social goal or benefit.

We’re currently running two projects, and expect to be running three or four more this summer.

If your organization needs data analysis this summer, please contact the Data Library Director, Eric Busboom at (619) 363-2607 or


Age Friendly Communities Project

Posted on Dec 12, 2016 in Projects

At tomorrow night’s meeting, we’ll be kicking off two new data projects. The first is the Healthy Food Access project, previously announced, and the second is the Age Friendly Communities project, for which we’ve just posted the project page. In this project we will be collecting data to analyze the capacity and affordability of San Diego’s assisted living industry, considering the anticipated need for these services over the next 30 years. Hope you can join us.


Wrangling Data For Social Projects

Posted on Dec 6, 2016 in Projects

Next week we’ll be kicking off two new data projects. A big part of these projects will be finding data, documenting it, and preparing it in a consistent way for analysis, a process known as data wrangling. I’ve been developing software for wrangling social data for a few years, and have collected many of the best ideas into a new metadata system called Metatab. Metatab is a system for storing structured metadata in a CSV file, often alongside data, making it easier to create and publish metadata.
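To make the idea of keeping metadata in a CSV file more concrete, here is a minimal sketch in Python. It is not the Metatab parser and does not follow the actual Metatab specification: the term names, the "Section" convention, and the sample values are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical metadata sheet: one term per row, value in the second column.
# The terms and values are invented for illustration, not real Metatab output.
SAMPLE = """\
Term,Value
Title,Example Food Access Dataset
Description,Placeholder description for illustration
Section,Resources
Datafile,http://example.com/food_access.csv
Datafile,http://example.com/stores.csv
"""


def read_metadata(text):
    """Group term/value rows by section; repeated terms collect into lists."""
    sections = {"Root": {}}
    current = "Root"
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # ignore blank or incomplete rows
        term, value = row[0].strip(), row[1].strip()
        if term == "Section":
            # A "Section" row starts a new group of terms (assumed convention).
            current = value
            sections.setdefault(current, {})
        else:
            sections[current].setdefault(term, []).append(value)
    return sections


metadata = read_metadata(SAMPLE)
print(metadata["Root"]["Title"])
print(metadata["Resources"]["Datafile"])
```

Because the metadata is an ordinary CSV, it can be edited in any spreadsheet program and read back with standard tools, which is the appeal of this kind of format.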

In the next two data projects, we will be using the Metatab Google Spreadsheet Add-On to document the data we locate. Once a Metatab specification is created for a dataset, it can be uploaded to CKAN, our data repository software, directly from the Google spreadsheet system. I’m also currently working on other tools for finding and manipulating data.

When we are done with the main data wrangling, there will be collections of datasets in our main data repository related to food access and assisted living. Then we can start on data analysis, most likely using Pandas and Tableau, but we may also try a few AWS tools like AWS Athena and AWS QuickSight.

Register for the Meeting



Wanted: Data Library Project Manager

Posted on Sep 15, 2016 in Misc

The Library has been quiet lately, mostly because the Director doesn’t have time to properly manage projects, and because he’s not very good at it. So, we’d like to find a volunteer to manage projects. The role would involve:

  • Talking to nonprofits and journalists about data needs
  • Recruiting other volunteers for data projects
  • Setting up meetings and finding rooms
  • Participating in data projects

We’ve got two projects that need some attention: the Healthy Food Access Data Library and a crime prediction contest.

If you are interested in data and have good organizational skills, please apply by sending email to Eric Busboom,


Healthy Food Access Data Library Project

Posted on Aug 16, 2016 in Projects

Collect and analyze data about the food system in San Diego County.

The San Diego Food System Alliance’s Healthy Food Access Working Group is developing an indicator library to analyze food access issues, and we need your help to locate datasets, wrangle them into usable shape, and create visualizations.

The work is similar to the topics of our March 2015 Data Contest, with the added task of building a reusable data library to support further analysis.

This project needs volunteers with a range of skills, including:

  • Administration and logistics: Call potential data providers, locate datasets, and arrange meetings and events.
  • Data wranglers: People skilled with either Excel or Python to manipulate datasets.
  • Data analysts: Analysts who know R or Python/Pandas.

We will be starting with a list of potential datasets, from which we will construct Ambry Data Bundles. We can load the bundles into a data library. Then we can do visualizations and analysis, such as this map from a project at Palomar College.

How To Participate

To participate in this project, join the Practical Data Program, then join the project mailing list by selecting the “Food Access” list under the “List Memberships” section of your profile page.

Team meetings will be posted to our site, the Practical Data Program site, and our Practical Data Program mailing lists. We’ll have our first meetings in late August.


Crime and Community Data Challenge

Posted on Apr 3, 2016 in New Data

To announce the arrival of a new set of crime data, our next meetup will be a mini data contest. In this meeting, we will present the new Crime Incident dataset and talk about how to link it to other social datasets. After the presentation, we’ll challenge you to do your own analysis, with a $100 prize for the best analysis from a student, undergraduate or lower.

Then, for the next meeting, we’ll invite the best analysts to present their findings and techniques.
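One common way to link incident records to other social datasets is through a shared geographic key, such as a census tract ID. The sketch below shows that pattern in plain Python. The field names, tract IDs, and numbers are hypothetical and are not taken from the Crime Incident dataset; the actual columns may differ.

```python
from collections import Counter

# Hypothetical incident records, each tagged with a census tract ID.
incidents = [
    {"type": "burglary", "tract": "T1"},
    {"type": "vandalism", "tract": "T1"},
    {"type": "burglary", "tract": "T2"},
]

# Hypothetical tract-level social data, keyed by the same tract ID.
tract_population = {"T1": 4000, "T2": 2500, "T3": 3000}


def incidents_per_1000(records, population):
    """Count incidents per tract, then join to population on the tract ID."""
    counts = Counter(rec["tract"] for rec in records)
    return {
        tract: 1000 * counts.get(tract, 0) / pop
        for tract, pop in population.items()
        if pop > 0  # skip tracts with no residents to avoid dividing by zero
    }


rates = incidents_per_1000(incidents, tract_population)
for tract, rate in sorted(rates.items()):
    print(tract, round(rate, 2))
```

Tracts with no recorded incidents still appear in the result with a rate of zero, which keeps them visible when the rates are later mapped or compared.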
