Open Data Portal - Part 2

I had been thinking about Datasette all wrong. I’m used to working with static websites where I download code, modify it at will and upload it somewhere to get served as a website. Instead I need to think of Datasette as an application I can install.

$ pip install datasette

Then I install sqlite-utils, the library made by the Datasette creator to make it easier to manipulate sqlite databases.

$ pip install sqlite-utils

I got the routes.txt file from Metro’s bus GTFS data and loaded it into a sqlite database.

$ sqlite-utils insert routes.db routes routes.txt --csv

The documentation is a little bit vague on this but I believe the command format is the following, where the final flag names the input format (--csv or --tsv) rather than a delimiter:

$ sqlite-utils insert [database-name.db] [table-name] [source-file-name.ext] --[format]
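To make it clearer what a command like this does, here is a rough sketch using only Python’s standard library. It is not how sqlite-utils is implemented; insert_csv is a hypothetical helper, and the two sample routes are made up to stand in for the real routes.txt.

```python
import csv
import io
import sqlite3


def insert_csv(conn, table, csv_file):
    """Create `table` from the CSV header row and insert every data row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Quote identifiers so column names with odd characters still work.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    quoted_table = '"{}"'.format(table.replace('"', '""'))
    conn.execute("CREATE TABLE IF NOT EXISTS {} ({})".format(quoted_table, columns))
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders), rows
    )
    conn.commit()
    return len(rows)


# Hypothetical sample data standing in for the real routes.txt.
sample = io.StringIO(
    "route_id,route_short_name,route_long_name\n"
    "2,2,Sample Route A\n"
    "4,4,Sample Route B\n"
)
conn = sqlite3.connect(":memory:")
inserted = insert_csv(conn, "routes", sample)
print(inserted, conn.execute("SELECT COUNT(*) FROM routes").fetchone()[0])
```

The header row becomes the column names and every value is stored as text, which is roughly what I would expect from a plain CSV import with no type detection.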

OF COURSE after I go through this I find this other library from the Datasette creator… csvs-to-sqlite. It’s even easier to turn a csv into a sqlite database.

$ pip install csvs-to-sqlite

The README doesn’t mention this but it will add a new table to an existing database.

$ csvs-to-sqlite trips.txt routes.db

$ sqlite-utils tables routes.db --counts

[{"table": "routes", "count": 134},
 {"table": "trips", "count": 27912}]
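For anyone curious what that table listing amounts to, the same counts can be pulled with plain SQL. This is a minimal sketch with the standard sqlite3 module; table_counts is a hypothetical helper and the tiny in-memory tables are made up, so the numbers will not match the real GTFS data.

```python
import json
import sqlite3


def table_counts(conn):
    """Return [{"table": name, "count": n}, ...] for every user table."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    return [
        {
            "table": name,
            "count": conn.execute(
                'SELECT COUNT(*) FROM "{}"'.format(name.replace('"', '""'))
            ).fetchone()[0],
        }
        for name in names
    ]


# Made-up stand-in tables; the real database would be opened by file name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE routes (route_id TEXT)")
conn.execute("CREATE TABLE trips (trip_id TEXT, route_id TEXT)")
conn.executemany("INSERT INTO routes VALUES (?)", [("r1",), ("r2",)])
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)", [("t1", "r1"), ("t2", "r1"), ("t3", "r2")]
)
counts = table_counts(conn)
print(json.dumps(counts))
```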

You can even pass it multiple files and it will bundle them into a single database:

$ csvs-to-sqlite routes.txt trips.txt gtfs-bus.db

I ran Datasette against this new gtfs-bus.db and saw a top-level heading of gtfs-bus (the database) with routes and trips listed (the tables, each loaded from an individual file). SQL querying worked.
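As an example of the kind of query this makes possible, here is a sketch that counts trips per route. It assumes the two tables share a route_id column, as GTFS routes.txt and trips.txt do; the rows are made up and the query runs through Python’s sqlite3 module rather than the Datasette interface.

```python
import sqlite3

# Made-up rows standing in for the real routes and trips tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE routes (route_id TEXT, route_short_name TEXT);
    CREATE TABLE trips (trip_id TEXT, route_id TEXT);
    INSERT INTO routes VALUES ('r1', '10'), ('r2', '20');
    INSERT INTO trips VALUES ('t1', 'r1'), ('t2', 'r1'), ('t3', 'r2');
    """
)

# LEFT JOIN keeps routes that have no trips, with a count of zero.
TRIPS_PER_ROUTE = """
    SELECT routes.route_short_name, COUNT(trips.trip_id) AS trip_count
    FROM routes
    LEFT JOIN trips ON trips.route_id = routes.route_id
    GROUP BY routes.route_id
    ORDER BY trip_count DESC, routes.route_short_name
"""
trips_per_route = conn.execute(TRIPS_PER_ROUTE).fetchall()
print(trips_per_route)
```

The SQL string on its own should work pasted into Datasette’s query box, assuming the column names match.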

Publishing

Publishing to Heroku requires the Heroku CLI. I used the auto-updating standalone installation because the ‘snap’ install didn’t work.

$ curl https://cli-assets.heroku.com/install.sh | sh

Logging in to my Heroku account was super easy. I crossed my fingers and ran the publish command… and it FAILED because the name ‘datasette’ was already taken. What a dumb error, of course I need to give it a unique name.

$ datasette publish heroku gtfs-bus.db -n datasette-gtfs-bus

Voilà: Datasette demo running GTFS bus data!

Open Data Portal - Part 1

Looking into options for standing up an open data portal, hosting datasets, etc. Things I’m looking at:

  • CKAN - It’s in python. I haven’t really looked into what development looks like.
  • JKAN - The simplicity is very appealing but it hasn’t been actively developed in a while. Even a Bootstrap 4 update is stuck and not merged into the main codebase.
  • Qri.io - They responded to my questions on Discord. It seems like it’s more for individual datasets, like a single table of data. Cool concept though and has a GitHub-esque online component.
  • Datasette - This looks like something I could start with.

I’m realizing I’ve got a lot of anxiety around running these python projects as production websites. It’s uncharted territory for me because I usually only play with them locally. Datasette has a Docker image available, which I’m pretty sure is a good thing. I’m still working on wrapping my head around how to use Docker and the best way to do that is to jump right in. The pandemic has really raised my anxiety in all this because I don’t have a community of peers I can easily tap into as a resource. I’m working on leaning on my friends but it doesn’t come naturally. Anyways…

The Datasette Installation Guide provides instructions for running via Docker.

Get the Image

Since I’ve already got Docker installed on my Windows machine, I can open PowerShell and just run the > docker pull ... command, and the image shows up in the Docker Desktop application.

Run a Container

The installation guide gives a command with options to spin up a container. The question now is how to get Docker Desktop to run with those options. I don’t see any place to enter the commands. Looking at the Docker documentation, it looks like I can probably just run the > docker run ... command in PowerShell, but I need to change the directory syntax for Windows. I still couldn’t get the page to show up in the browser that way, so I ran the command in WSL instead and that worked.

Development

I don’t think I’m supposed to do my development on the Docker container itself…? It looks like the container ran the application and as the host I’m able to pass it whatever data source I want. However if I want to change the code for the application itself (to change the CSS for example) I’ll need to probably clone the repo and then re-dockerize it. Do I need to dockerize it if I’m only deploying it for myself? Can I deploy the code to Heroku or something? Maybe it makes sense to dockerize if I’m deploying to AWS. I’ll need to figure out how to create the database file.

Saying No

It’s almost March of 2019 already, and the last time I wrote a post was in July 2018. So much has been happening, I wish I had documented it better.

2019 is looking to be even crazier and I expect to get a lot of practice saying ‘No’ this year. I thought it’d be a good idea to document that… as well as the things I’m saying ‘Yes’ to.

Events / New Commitments in 2019

  • So Cal ACLU Board
  • School of Data conference
  • Arts Datathon Planning Committee - Crafting Track
  • Playing the erhu as part of the Chinese Kun opera performance for Bitter Party
  • SCaLE Open Government Track committee
  • Run for NAC again
  • LAC Women In Technology Employee Association
  • Grace Hopper Celebration - Career Track proposal evaluations

No

  • Organizing HFLA’s Open Data Day.
  • Panel on government data at NICAR - conflicts with SCaLE and I need to table in the afternoon.
  • Attending Tech Ladies Brunch - Too much socializing, I need a rest day this weekend.
  • Attending Global Diversity CFP Day - Too much effort, I need to chill this weekend.

Ongoing Commitments

  • Hack for LA leadership
  • MaptimeLA leadership
  • LAC Asian American Employee Association
  • LAC WordPress User Group
  • Data + Donuts LA
  • Open California Collaboration

MaptimeLA Channel Islands Camping Trip

Last weekend was MaptimeLA’s second camping trip. We took a boat from Ventura to Santa Cruz Island, one of the five islands that make up Channel Islands National Park. I didn’t do much research heading in so I didn’t know what to expect, except that there were specific rules on what could be brought on the boat and to the island.

Turns out it was much more relaxed and built out than we expected! You can rent snorkeling and kayaking gear on the island. People do day trips. The campsites are a short trek away on a flat dirt road - individual sites are 0.3 miles in and the group sites are another 0.3 miles further. The campsites have potable water spigots and well-maintained pit toilets that have toilet paper, toilet seat covers, AND hand sanitizer! The campsites are kept shady by the many large eucalyptus trees, which are found only at the campgrounds. The one thing to watch out for is the wildlife - ravens and foxes will steal food if you don’t keep it secured in the fox boxes. We saw it happen several times, and if the rangers catch you being careless you’ll get fined. Several trails start in the area, but they all go up and there is no shade cover, so be careful hiking on hot days.

Our boat ride on Friday was very choppy because of a coastal storm somewhere. We passed by a huge traveling pod of dolphins! We did an evening hike up to Potato Harbor, which was wonderfully windy as we walked along the bluffs. Over the next two days, the wind died down considerably and it was much hotter. The others attempted to hike, but I chose to stay in the shade at the campground - which was conveniently still breezy. I was intent on having a relaxing weekend, enjoying the nature and remoteness while knitting, crocheting, and reading. Mission accomplished for me. For food, I stuck to my diet by snacking on a variety of low carb cheese breads, pesto, nuts, salami, jerky, and bacon.

It was such a great relaxing weekend and it was so much fun to watch the baby foxes play. I’d love to go back!

July Life Update

June and July are both turning out to be a non-stop meet and talk to people fest. Today I had lunch with Kerry Silverstrom, Chief Deputy of LA County Beaches and Harbors. She is one awesome lady and she keeps it so real. Over the weekend, I saw Omar and Leigh, which was great… it’s been a while since we last all got together. We talked about project structures and being inspired by the Center for Urban Pedagogy.

I meant to write more, but work has been busy, renovating the house has been busy, and I need to prep for our July 4th BBQ, which is in 2 days.