My upcoming “practice your data sci skills” events
I’ve got a couple of different speaking engagements coming up: a workshop here in Chicago and a preconference in Vegas!
First, in about two weeks, on April 24, my research partner Rebecca Raszewski and I will be giving a workshop at the University of Chicago. It’s being held the day before the Zar Symposium, whose subject this year is Data: Collecting, Using, Managing. I can’t find any further information on the website, and I know they were going to set a small cap, so it’s possible that the workshop is full. But so you can see what I’m doing, here’s the official announcement:
Exploring Our Own Data: Practical Application of Data Management Skills to Library Data Sets
[April 24, 2014, 1:00-4:00; The John Crerar Library; The University of Chicago]
As need for experience with managing data sets continues to grow, librarians need hands-on opportunities to work through the data life cycle and practice their own data management skills. Library data sets present a rich trove for exploration and expansion of this skill set. Join Abigail Goben and Rebecca Raszewski for a hands-on workshop using reference desk statistics to improve your data skills. Discuss challenges and pitfalls of working through the data life cycle, including licensing rights and the hazards of multiple hands capturing data. Explore creating forms and doing some preliminary statistical analysis with Google Tools. Utilize OpenRefine, an open source tool for standardizing and transforming messy data, with an instructor-provided data set.
And then in June I’ll be giving a preconference for LITA at the ALA Conference in Vegas. Sarah Sheehan, Nathan Putnam, and I will be doing a full day on practicing data science skills, working with an actual library data set.
Here’s the official announcement:
“Managing Data: Tools for Plans and Data Scrubbing” with Abigail Goben, University of Illinois, Chicago; Sarah Sheehan, George Mason University; Nathan B. Putnam, University of Maryland.
As data continues to come to the fore, new tools are becoming available for librarians to assist faculty and use with their own data. This preconference will focus on the DMPTool and OpenRefine. The DMPTool will be presented to demonstrate customization features, review data management plans, best and worst practices and writing a data plan for a data set a library may collect. OpenRefine will be demonstrated with sample data to show potential use with library data sets and more of the data lifecycle process; metadata will also be covered.
So what does that really mean? It means we’re going to spend the day getting our hands dirty, doing a lot of hands-on practice, and we’re going to do it with actual library data. If you’ve got a small data set that you want to bring and spend the day working with, you’ll be encouraged to do that, but we’ll also have one there for you to work on. You’ll work through the process of a data lifecycle, answering the questions not just as a librarian, but as a data scientist. I expect a rousing debate about sharing non-sensitive library data, and I am looking forward to seeing how people approach data standardization differently, as well as how they document that standardization. Neither Sarah nor I is a full-time data librarian, so we’re coming from the liaison perspective. Nathan, though, is the Head of the Metadata Services, and he’s going to take all of us through improving our metadata skills (one of the advantages of putting this together is I get to learn from him too!).
Registration is definitely still open for the preconference. Please feel free to reach out to me if you have questions! If you have an idea for a data set that you’d like to bring, it’s worth starting to think about that now, as it’s ALREADY APRIL (yeeesh) and June is going to be here shortly after I finish my next cup of coffee.