Data in San Francisco:
Meeting supply, spurring demand
City and County of San Francisco
Mayor Edwin M. Lee
Joy Bonaguro, Chief Data Officer
July 31, 2015
Table of Contents
1. Executive Summary
2. Mission, Vision and Approach
3. Looking Back: The Year in Review
4. Looking Forward: Year 2 Goals and Strategies
   Overview of Approach and Goals
   Goal 1. Make timely data easily available
   Goal 2. Improve the usability, quality and consistency of our data
   Goal 3. Support increased use of data in decision-making
   Goal 4. Identify and foster innovations in open data and data use
   Goal 5. Continuously improve, scale, maintain and monitor our work
5. Priority, Resource and Contingency Analysis
6. Conclusion
Appendices
   Appendix A. Acknowledgements
   Appendix B. Detailed Accomplishments in Year 1
   Appendix C. Quarterly Milestones for Year 2
   Appendix D. Crosswalk between plan and Open Data Policy
Data in San Francisco: Meeting supply, spurring demand - Return to Top
2 of 40
1. Executive Summary
Our Mission and Vision
At DataSF, we are working to transform the way the City works through the use of data. Our mission is to empower use of the City’s Data. Our vision is that the City’s data is understood, documented, and of high quality. The data is published so that it is usable, timely, and accessible, which supports broad and unanticipated uses of City data. City employees have the skills and capacity to collect, manage, and use data effectively and efficiently across its lifecycle.
The Ultimate Impact of Our Work
Through the dissemination and use of City data, we can:
● Improve City services for residents and businesses,
● Generate jobs and economic activity and
● Increase resident engagement and empowerment.
These in turn support increased quality of life and work for San Francisco residents, employers, and employees.
Our Key Accomplishments in Year 1
Below are some of our key accomplishments in Year 1. Section 3 of this document goes into greater detail for each goal area.
Completed the dataset inventory
Our core charge in Year 1 was completing a dataset inventory to list all of the datasets in each department. This was an immense task and took up a great deal of our effort and time in the last year. Our Department Data Coordinators were key to this task and without them it would not have been possible. Learn more in our blog post on the inventory.
Relaunched our open data portal and created a web home for DataSF
Our web presence needed a total overhaul to ensure that we could better support our users, whether seeking data or working to publish it. In addition to the open data portal, we needed to create enduring resources like our publishing and coordinators portal as well as our resource library and blog. Learn more in our blog post on the redesign.
Standardized publishing methods and metadata requirements
Standardizing the publication of datasets ensures high quality publishing over time. Consistent information about published datasets makes the data easier to use, fostering more and better use of the data. We took into account best practices from around the world and then tailored them to San Francisco to ensure quality publishing. Learn more in our blog post on metadata.
Established a Citywide open data license for published data
The City needed a licensing strategy designed for data. A single license reduces ambiguity for users and ensures that our data can be fully leveraged by individuals and companies alike. We officially adopted the Public Domain Dedication License (PDDL) to meet the particular needs of
open data. Learn more in our blog post on PDDL.
Launched the Housing Data Hub
The Housing Data Hub, http://housing.datasf.org/, is a single place to learn about affordable housing data programs in San Francisco and the administrative data behind them - visualized and easy to use. This was our first strategic release - the bundling of open data publication with products that put the data to immediate use. Learn more in our blog post on strategic releases.
Launched the Data Academy
Working in partnership with the City Services Auditor, we launched a training program that covers the whole lifecycle of data - from planning, collection, management, analysis to design and publishing. Classes are booked out and demand is insatiable. Read about Data Academy.
Developed a strategy to improve confidential data sharing
Internal confidential data sharing is hampered by a legal thicket and poorly integrated technical systems. Working in partnership with the City Services Auditor and more than a dozen City departments, we put together a strategy to promote data sharing that is efficient, effective, consistent, secure, and appropriate.
Advocated for and obtained additional resources
Our resource strategy for Year 1 was to 1) seek institutional homes and partners for our work and 2) pursue dedicated resources where appropriate and with good justification. This time last year, we were a team of one. We doubled our team with the role of the Open Data Program Manager last fall. And during the year, we put together business cases to double yet again with new roles to support 1) open data services and 2) execution of our confidential data sharing project. We will continue to work closely with key partners around the City doing similar work.
Our Roadmap for Year 2: From Foundation to Use
Our Year 1 plan [1] was about building a foundation for the future and creating the institutional support to grow use and dissemination of data in San Francisco. In Year 2, we need to build upon that foundation and ensure a ready and predictable supply of data that addresses data gaps and needs. If last year was about building the house, this year is about moving in and throwing a big house-warming.
Year 2 Goals and Subgoals
For Year 2, we are structuring our work around five core goals and subgoals as needed.

Goal 1. Make timely data easily available
   Subgoals:
   1. Increase number and timeliness of datasets on SF OpenData
   2. Enable use of private data, while appropriately protecting it
   3. Streamline internal data access
Goal 2. Improve the usability, quality and consistency of our data
Goal 3. Support increased use of data in decision-making
   Subgoals:
   1. Increase internal capacity
   2. Support public capacity
   3. Foster and incent a data culture
Goal 4. Identify and foster innovations in open data and data use
Goal 5. Continuously improve, scale, maintain and monitor our work

[1] Read “Open Data in San Francisco: Institutionalizing an Initiative” via Google Docs.
As we execute on these goals and supporting strategies we look forward to reporting on key accomplishments next year. Below are a handful of accomplishments we plan to achieve this year:
● Fully deployed data automation as a service to ease data publication
● Deployed better, friendlier publishing for geographic data
● Identified methods to crowdsource collective intelligence about published datasets
● Launched new transparency websites
● Engaged our broader community around a handful of key issues or datasets
● Developed “Data Concierge” to streamline internal data access for City employees
● Established center to facilitate and standardize confidential data sharing
● Began to systematically tackle data quality
● Developed Data Academy into a professional development strategy
● Enriched our data through effective storytelling
We encourage you to visit our website, at datasf.org/about to track our progress over the next year. We will post quarterly reports on our strategic plan, including updates and revisions. You can also view our publishing progress at datasf.org/progress.
2. Mission, Vision and Approach
Our mission is to empower use of the City’s Data. Our vision is that the City’s data is understood, documented, and of high quality. The data is published so that it is usable, timely, and accessible, which supports broad and unanticipated uses of our data. City employees have the skills and capacity to collect, manage, and use data effectively and efficiently across its lifecycle.

Like our Year 1 plan, our Year 2 plan is ambitious. To execute on our plan we will adhere to some core approaches for how we manage our work:
1. Say no to perfection. We don’t have enough time for perfect. Something is better than nothing and you can always improve it as you learn more.
2. Fail early and often. Failing is ok - not learning from a failure is not ok. Small experiments, failed or successful, inform our next steps.
3. Plan for the future. Create infrastructure and systems for future growth - but solve immediate problems and pain points along the way.
4. Use long division. If a problem seems too big, break it into manageable bits. There’s always a hook or a starting point to move something forward.
5. No ugly, old IT. We leverage existing, modern, and light-weight tools and we want our designs to be beautiful, inviting but also a little fun.
6. Use storytelling and data. We must work to find the people in the data and tell their story. Data without people is just academic.
7. Seek institutional homes. Distribute, share and foster excellence. While we may incubate programs, ideas or projects, we ultimately need to find a full-time home.
8. Learn to infinity and listen with humility. Continuously learn from ourselves and others and build on existing frameworks. “Not invented here” attitudes are strictly prohibited.
9. Start with problems, move to opportunities. We start with people's needs and problems but also use the chance to show them some cool, new stuff for the future.
10. If we don’t start now, we’ll never get there. We don’t want to look back in five years and think “if we had just…”. Every shady street started with a row of saplings.
3. Looking Back: The Year in Review
Summary and Reflections
Building the Foundation
In Year 1 our focus was defining the scope of the program, identifying and developing key partnerships, and of course, building out the programmatic infrastructure, including core services, business processes, and roles and responsibilities. The work we completed in the last year provides the foundation upon which we will build our data work for the City. While the foundation is not yet complete, we have made tremendous progress.
It Takes a Village
A huge portion of that progress is due to key partnerships with the Department of Technology (in particular, the DT GIS team for data automation and services) and the Controller’s City Services Auditor (for a variety of projects). These partnerships allowed us to execute on several components of our strategic plan that were not resourced at the start of last year. We expect these partnerships to grow and strengthen over the next year and we are already exploring new partnerships within and outside the City.

Much of our open data work this year would not have been possible without our Data Coordinators. Our coordinators were essential in conducting major aspects of our Year 1 plan, including the dataset inventory that lists all datasets held by the City and County. The effort and quality of their contribution cannot be overstated. Thank you Data Coordinators!

We also received a huge infusion of talent and energy from our interns and graduate students throughout the year. And the partnership that has emerged with our local Code for America Brigade, Code for San Francisco, has been invaluable - not only with projects but as a means of keeping us real.

Lastly, we added an incredibly talented person to the core open data team - Jason Lally. His passion, insight and effort as our Open Data Program Manager have been at the heart of almost every key accomplishment this year. Appendix A includes a detailed list of the many thanks we owe from this last year.
Program Highlights
In the sections below, we cover highlights for each goal. Appendix B includes a link to our final milestone report and includes a summary table describing the accomplishments by strategy in greater detail.
Goal 1. Increase number and timeliness of datasets on DataSF
Completed the dataset inventory
Our core charge in Year 1 was completing a dataset inventory to list all of the datasets in each department. This was an immense task and took up a great deal of our effort and time in the last year. The full inventory is published as a dataset on SF OpenData or you can view the visual link as shown in the screenshot (this uses a new feature on the data portal called Data Lens).

Our Data Coordinators were a key part of making this successful and it would not have been possible without them. They are the true heroes in this effort.

Given that we had found little guidance on how to conduct a comprehensive data inventory, we documented our approach and reflections in a blog post “5 Ways to Scale the Mountain of Data in Your Organization.” Our hope is that other open data programs can learn from our experience. We also made all of our materials available in our Resource Library.

As of the end of the fiscal year, 75% or 39 of 52 departments had completed or partially completed the inventory. We will add additional departments on a rolling basis. In addition, we are building a whole series of tools and resources on top of the data inventory - turning it into a platform.
Developed three key methods to prioritize data for publication
Our dataset inventory includes over 700 datasets and counting and none of them come with a magical publish button. So we have to prioritize our data for publication. We developed 3 key methods to do so (and soon a 4th):
1. Department Drip. As part of the inventory, we asked departments to prioritize their data as a function of value and data classification and to inform publishing plans (forthcoming).
2. Endorse a Dataset. The data inventory can be used to elicit both internal and external endorsements to publish data. While we haven’t built this yet, it is coming soon.
3. Strategic (or Thematic) Releases. One of the challenges of open data is that it often involves the release of unrelated data in a haphazard manner. Strategic releases are born out of a belief that simply publishing data is no longer sufficient. Open data programs need to take on the role of adding value to open data versus simply posting it and hoping for its use. One way is to release a body of data plus a product that puts the data to use out of the gate. This can help open data become more relevant to a local audience that is focused on issues, not just apps, which is what we did with the Housing Data Hub.

We provide more details on our approach in our blog post “How to Unstick Your Open Data Publishing” and you can view the prioritization grid for the Department Drip strategy below:
Launched data automation as a service in partnership with Department of Technology
One of our key criteria under this goal is the timely and regular publication of data. If we rely on individuals to publish data, we will not be able to scale our program. So we partnered closely with the Department of Technology’s GIS team to develop the business model and supporting processes and technology to offer data automation as a service. Later this year we’ll be publishing our ETL Toolkit that will document our work and serve as both an internal and external reference.
Launched support programs and portals for Data Coordinators and Publishers
As part of the dataset inventory, we developed a program to actively engage our Data Coordinators, including creating a Data Coordinator web portal, workshops, webinars and a slew of online resources. And we’ve started the process of better supporting our publishers with the launch of the publishers portal at the end of the year. We expect our support effort for Data Coordinators to decrease and for publishers to increase in the next year.
View the Coordinators Portal and Publishing Portal.
Goal 2. Improve the usability of DataSF
Launched the new SF OpenData
Overhauling the open data platform was a core deliverable in Year 1. The image below shows the before and after:
Our blog post, The New DataSF!, details more about the key usability changes we made.
Launched a new web home for our overall program, DataSF
In addition to the portal overhaul, we needed a new web presence to showcase the rest of our work. This is also when we rebranded the data portal as SF OpenData and reserved DataSF for our overall program. Even better, the code we used to build the website is freely available for others to repurpose and use, as you can read in our blog post “Raising the digital barn”.
Collaborated on new portal features
Lastly, we partnered heavily with our vendor to introduce some new features to the portal. While these are still in the works, we are excited about some of the new tools and features that will help make open data easier to use for everyone - not just technical folks.
Goal 3. Improve the usability, quality and consistency of our data
Created and deployed a metadata standard for SF OpenData
Ensuring that our data is well documented prior to publication is a key part of making it usable. Unfortunately, metadata (data about data) is usually an afterthought. We made it front and center and upped the documentation requirements to ensure that our data is not simply published - it is published with information that can help folks use it. You can read more about the process we followed and how we elicited community and City feedback in developing the standard in our two blog posts “Metadata & Dating - More in Common than you Think…” and “U Heart Metadata”. We also published all of our metadata research and materials in our Resource Library.
Reset and standardized how we publish data on SF OpenData
Part way through the year, we realized we needed to dedicate work to resetting the published data on DataSF. Much of the data had been published in inconsistent ways, with varying standards and restrictions. We codified publishing guidelines and incorporated them into the Publishing Portal. The reset work is still underway but a key visible accomplishment was the relaunch, in partnership with the Police Department and the DT Open Data Services team, of police incidents as a single multi-year dataset natively hosted on SF OpenData. Previously, the data had been published as separate shapefiles for each year with only the last 30 days on SF OpenData natively. Native hosting allows you to easily generate maps and other visuals as shown in this map which shows all police incidents since 2003 in a single map.
Deployed a help desk for incorporating and tracking user feedback and questions
Understanding what questions users have about our data helps us improve how we publish it. Our user feedback methods were limited to a nomination form provided by our vendor. In lieu of this we created our own Contact Us form and are tracking data questions and requests in a single place using an enterprise ticketing system. By codifying and quantifying this, we can better respond to user needs. In Year 2, we’ll be expanding the number and types of user feedback mechanisms we use.
Goal 4. Enable use of confidential data, while appropriately protecting it
Goal 4 was largely dependent on resources. Fortunately, we were able to partner in the Fall with the City Services Auditor and several key agencies to put together a comprehensive strategy to address this goal. This was a key pivot in our approach to focus on the use of data in the context of coordinated care. The picture captures an activity from one of our planning sessions.
Why Coordinated Care?
Social service delivery is in the midst of a migration from program-centric to people-centric care. Our most vulnerable individuals touch multiple systems - education, human services, and criminal justice - which have historically operated in silos. The transition to coordinated care will better meet the needs of our clients by tailoring care to meet the needs of each individual, rather than administering programs with a one-size-fits-all approach.

A coordinated care approach is best carried out when multiple jurisdictions are able to share data about the individuals they are jointly serving, so that efforts are not duplicated, and the dosage of services is based on the right mix of supports. Unfortunately, most of our rules and laws regarding data sharing were made within distinct verticals, such as health care, early education, education, criminal justice etc. This legal thicket leads to an implementation thicket. Each jurisdiction navigates this thicket afresh, which concentrates risk on individuals and localities interpreting the law. In addition to the legal work, we need coordinated policies and procedures as well as the right mix of technology and supporting infrastructure.

The diagram below shows the focus of our project, which will be a multi-year effort, in the context of coordinated care:
Goal 5. Support increased use of data in decision-making
Launched Data Academy in partnership with the City Services Auditor
Our analyst survey from last year demonstrated an unmet need for more training in data use, collection, and visualization. Fortunately, the team in the City Services Auditor was offering a few classes and we teamed up to expand the number, type and frequency. We also launched a website for the Data Academy. The demand has been incredible and the feedback very positive. Below is a picture from our Basics of Information Design class.
Developed the Stat Starter Kit in partnership with the City Services Auditor
While the Data Academy targets individual skills, we also wanted to support department skills in using data. A variety of departments expressed interest in starting “performance stat” programs. To respond to this, we partnered with the performance team in the City Services Auditor who led the way putting together a series of resources to help departments start “stat” programs. We’ll be launching the Stat Starter Kit early in Q1 of Year 2.
Launched the Housing Data Hub in partnership with a village
While the previous programs support individual and department skills in data, we also wanted to leverage open data to improve public capacity to use and understand City data. While we are at the beginning of this journey, we were excited to launch the Housing Data Hub (housing.datasf.org) this year. The Housing Data Hub is a single place to learn about policies and programs related to housing affordability as well as the administrative data behind them - contextualized and visualized for easy consumption.

This is part of a key strategy we are pursuing, which is to publish our data in a way that is more meaningful and accessible for our local stakeholders who care about local issues, not just applications. Read more on our thinking on what we are calling strategic or thematic open data releases. We think this is a key part of fostering a data-enabled policy environment.

The screenshot below shows one of the “data browser” views on the Housing Data Hub that incorporates just-in-time learning moments that help explain the data visualized below. In addition, users can link back to the original data on SF OpenData by clicking on “Get the source dataset”.

The Housing Data Hub was another great example of “it takes a village”. We received help from each of the key departments but also volunteers from Code for San Francisco. You can visit the Hub and read more about all of those who contributed.
Goal 6. Identify and foster innovations in open data and data use
Launched a blog and reclaimed our Twitter account
A key part of fostering innovation is engagement and communications. When we started last year, our Twitter account had been abandoned and we had very few ways of engaging and reaching our audiences. While we have so much more work to do here (and many things upcoming), re-establishing our voice was a key first step. You can read our blog at DataSF Speaks and follow us on Twitter @DataSF.
Launched the Resource Library
Another way to foster innovation is to document and share what we are doing so that other open data programs can benefit. We are finding that folks use our online resources and will follow up with additional questions or thoughts. We are also hearing from programs across the country (and occasionally world) that have adapted our documents and resources for local use.
Adopted a licensing strategy designed to foster open data reuse
One of the key issues in publishing data is ensuring that it can be legally reused. Unfortunately, this topic does not get enough attention. We surveyed the landscape to come up with a licensing strategy that would fit the unique needs of open data. And then we worked closely with our legal team to put it in place. You can read more about what we did in our blog post “Data License Liberation Day” and our research and related documentation are available via the Resource Library.
4. Looking Forward: Year 2 Goals and Strategies
Overview of Approach and Goals
From Foundation to Meeting Supply/Spurring Demand
If Year 1 was about building the foundation, Year 2 is about buying furniture, painting the walls, hanging photos and throwing a housewarming party. It’s time to open the doors and not just let folks in, but deliver the invite in-person. That’s why the theme for Year 2 is to fill out the supply of data but also ensure that it’s being used by a broader range of people.
Goal Shifts in Year 2
We expected our Year 1 goals to hold steady for the next three years. While this is generally true, we modified our goals to reflect some key insights:
● Two of our goals fit nicely under a broader goal of making timely data easily available. As a result, we consolidated the following two goals under a broader goal of “make timely data easily available”:
   ○ Increase number and timeliness of datasets on SF OpenData
   ○ Enable use of private data, while appropriately protecting it
● Our web presence demanded a huge amount of effort and focus in Year 1 to update it and establish a new, comprehensive presence. While we will continuously improve our online tools, the goal now fits more appropriately under a goal centered on continuous improvement and organizational excellence for the program.
Year 2 Goals and Subgoals

Goal 1. Make timely data easily available
   Subgoals:
   1. Increase number and timeliness of datasets on SF OpenData
   2. Enable use of private data, while appropriately protecting it
   3. Streamline internal data access
Goal 2. Improve the usability, quality and consistency of our data
Goal 3. Support increased use of data in decision-making
   Subgoals:
   1. Increase internal capacity
   2. Support public capacity
   3. Foster and incent a data culture
Goal 4. Identify and foster innovations in open data and data use
Goal 5. Continuously improve, scale, maintain and monitor our work
These goals continue to align with the three core challenges we identified for effective data use: 1) knowing what data we have, 2) having effective and efficient means of accessing it and 3) using data effectively.

Challenges: Knowledge, Access, Ability

Goal 1. Make timely data easily available
Goal 2. Improve the usability, quality and consistency of our data
Goal 3. Support increased use of data in decision-making
Goal 4. Identify and foster innovations in open data and data use
Goal 5. Continuously improve, scale, maintain and monitor our work

The following sections describe the strategies in support of these goals. Appendix C provides a link to a quarterly timeline and set of milestones for Year 2 and Appendix D provides a crosswalk with our open data policy that details how we are meeting the provisions of the legislation.
Goal 1. Make timely data easily available
A precursor to using data is access. Open data, published on a shared platform, is one way of making data available. In the near term we need to ensure that we are publishing or plan to publish the City’s data when allowed. We should also publish the data at a frequency that matches the rate of data change. For example, datasets that change daily should be refreshed daily. Some data is only allowed to be shared internally as it may be protected by law or is not available to be published in the near term. For these datasets, we need to ensure that we have effective and efficient means of accessing and sharing data when it is appropriate to do so.
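The rule that refresh frequency should match the rate of data change can be checked mechanically. The following is a minimal sketch only, not a description of how DataSF monitors its portal: the change-rate labels, the age limits in `MAX_AGE`, and the sample dataset names are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from a dataset's stated rate of change to the longest
# acceptable gap between refreshes. Labels and limits are assumptions.
MAX_AGE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=31),
}


def is_stale(change_rate: str, last_refreshed: datetime, now: datetime) -> bool:
    """Return True when the last refresh is older than the change rate allows."""
    return now - last_refreshed > MAX_AGE[change_rate]


def stale_datasets(datasets, now):
    """List the names of datasets whose refresh lags their rate of change."""
    return [
        d["name"]
        for d in datasets
        if is_stale(d["change_rate"], d["last_refreshed"], now)
    ]


if __name__ == "__main__":
    now = datetime(2015, 7, 31, 12, 0)
    sample = [  # made-up records for illustration
        {"name": "example-daily", "change_rate": "daily",
         "last_refreshed": datetime(2015, 7, 28, 12, 0)},
        {"name": "example-monthly", "change_rate": "monthly",
         "last_refreshed": datetime(2015, 7, 15, 12, 0)},
    ]
    print(stale_datasets(sample, now))
```

In this sample, the daily dataset was last refreshed three days ago and is flagged, while the monthly dataset is still within its window.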
Subgoal 1.1 Increase number and timeliness of datasets on SF OpenData

Strategy 1.1. Continue to mature our program to automate publication of data. One of the key challenges in opening data is extracting it from legacy systems and then preparing it for broader consumption. Older systems were not designed with data exporting or sharing in mind. Proprietary data formats need to be converted into modern, open formats, or the data may need to be reorganized or structured in a way that supports public distribution. Lastly, the processes that extract, transform and load data should be automated, such that after the initial configuration, we have little to no overhead other than monitoring the ongoing process.

In sum, our automation program (activities summarized as extract, transform and load - ETL) is a critical part of our overall program as it will support the key processes that ensure our data is extracted appropriately and published in a timely manner on DataSF. While we made excellent progress in Year 1, we need to ensure that our program continues to develop to obtain economies of scale and to be sustainable. Key elements of this will be to formalize business processes via automation, develop dedicated resources, scale and standardize our technical implementation, and track and measure program performance.

Strategy 1.2. Develop self-service model for data automation for large departments. While we’ve committed to providing data automation as a central service, we recognize that some departments are capable of and should have control over their data automation work. At the same time, we want to ensure consistency and quality in the automation of data. Developing a self-service model will help us achieve both goals.

Strategy 1.3. Target departments for wholesale data automation. During our data inventory (when we listed all datasets held across the city), we included a step that covered a list of systems. Our analysis of this list suggests that some departments are good candidates for wholesale automation - that is, their technical environment is homogeneous and they have a key technical contact that can streamline the work. For these departments, we will seek to automate the publication of their data as a single project (versus relying on department publishing plans).
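To make the extract, transform and load steps described under Strategy 1.1 concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the City's ETL Toolkit: the pipe-delimited "legacy export", the field names, and the function names are all hypothetical, and a real job would read from a source system and push to the portal rather than work on strings.

```python
import csv
import io


def extract(legacy_export: str):
    """Parse a pipe-delimited export (a stand-in for a legacy system dump)."""
    reader = csv.DictReader(io.StringIO(legacy_export), delimiter="|")
    return list(reader)


def transform(rows):
    """Normalize field names and trim whitespace so the data suits public use."""
    return [
        {key.strip().lower().replace(" ", "_"): value.strip()
         for key, value in row.items()}
        for row in rows
    ]


def load(rows) -> str:
    """Serialize to CSV, an open format; a real job would publish this output."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def run_pipeline(legacy_export: str) -> str:
    """Chain the three steps so the job can run unattended on a schedule."""
    return load(transform(extract(legacy_export)))


if __name__ == "__main__":
    # Made-up sample export for illustration only.
    sample = "Permit ID | Issue Date\n 1001 | 2015-07-01 \n 1002 | 2015-07-02 \n"
    print(run_pipeline(sample))
```

Keeping each step as a separate function is one way to let a central team standardize the load step while a department supplies its own extract step, which is the kind of split a self-service model might use.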
Strategy 1.4. Develop a geographic data access and publishing strategy. Our experience in Year 1 suggested that we need to have a distinct strategy for publishing geographic data, in particular data that consists of polygons (shapes/boundaries like police districts) and polylines (lines like streets). The canonical geographic datasets in the City are broadly used both internally and externally and have particular shared value that requires a more deliberate process for publishing (including geographic tools), data management, and communicating metadata. The Department of Technology’s GIS team is a key partner in this work.

Strategy 1.5. Establish methods to ensure SF licensing and publication of data for new information systems. While extracting data from legacy systems is painful, new systems should be built with open data as a standard output. Any new information system should be required to have automated outputs to support broader publication and dissemination of the city’s data, while retaining the appropriate licensing. In Year 1, we were surprised to find few or no best practices in this area. As a result we shifted the timing of this work and will seek to complete it in Year 2.
Subgoal 1.2 Enable use of private data, while appropriately protecting it
Strategy 1.6. Create “ShareSF” hub and develop supporting resources and business processes. As mentioned in the looking back section, we pivoted to a broader strategy for confidential data sharing in Year 1. As part of that strategy, our office was tasked with developing a “ShareSF” hub to facilitate internal confidential data sharing. Under this strategy, we will develop the programmatic components of a hub, including standard business processes, shared resources, legal frameworks, and governance.
Strategy 1.7. Explore technical solutions for confidential data sharing. The “ShareSF” strategy also calls for the exploration of technical solutions to confidential data sharing. While we expect this to have partial overlap with Strategy 1.11 below, we anticipate specific needs and requirements related to implementing technical controls for legally protected data.
Strategy 1.8. Create a process for accessing your individual data. A process for accessing data that the City holds about you will increase transparency and may help improve data quality. Our work in Year 1 suggested that this is best incorporated into existing systems and processes for data and information requests. As a result, we expect to wrap this process up in Year 2 and will focus on guidance and outreach to educate departments on this type of request.
Subgoal 1.3 Streamline internal data access
Through our City Analyst survey we have quantified the need for more effective and efficient means of accessing data between departments. While the open data portal is a key repository that we expect to leverage, some data either will not yet be available on the data portal or will be available in a format that is less amenable to internal data work. In some cases, this subgoal will complement Subgoal 1.2 for private data sharing.
Strategy 1.9. Develop methods to connect internal users to datasets. The dataset inventory we created in Year 1 was a major step towards understanding the scope of the City’s data holdings and addressing one of the key barriers - knowledge of data. Now that we have this list, we need not only to maintain it but also to leverage it to support internal data access. While other methods may emerge, tools built on top of the data inventory can support internal data access for datasets that are not yet published (or not published in the best format for internal use).
Strategy 1.10. Integrate internal data access needs into emerging technology strategies. As part of the Committee on Information Technology (COIT), the City has embarked on two key strategies: 1) Shared services and 2) Public experience. We will participate in the development of these strategies to ensure that the data access challenges we have identified are addressed in these broader, long-term strategies.
Strategy 1.11. Explore options to develop shared data systems for internal use. The number and variety of backend systems in the City is vast. While the open data portal may be one shared system, we would like to explore options related to a more robust enterprise layer for data access and management.
Goal 2.
Improve the usability, quality and consistency of our data
While Goal 1 provides access to the City’s data, the ultimate value of the data depends on its usability, quality, and consistency. Usability helps us understand the data - what is it, how is it collected, when is it published - the basic documentation that supports use of the data. Quality speaks to how reliable and complete the data is - can we trust the conclusions or decisions we make based on the data? Consistency helps us combine data from different systems, by using consistent definitions across datasets, whether it’s race or ethnicity, service categories, target populations, location, etc.
Strategy 2.1. Develop comprehensive data quality strategy for the City; implement via pilots and broader COIT strategies. Our Year 1 experience suggested that the City would benefit from a data quality framework and roadmap. We expect this to be a multi-year strategy in terms of development and execution. Over the next year we will identify motivated pilots to roll out our data quality strategy. Research suggests that aligned pilots over time are the most effective way to pursue a broader data quality approach. Pilots will likely include data consistency standards, data model alignment, and data management guidance and tools. As mentioned in Strategy 1.10, the City has embarked on two key technology strategies: 1) Shared services and 2) Public experience. These strategies represent an additional opportunity to insert codified data quality practices and policies into a broader strategy.
Strategy 2.2. Conduct targeted data quality improvements. During the middle of last year, we adopted this as a new strategy. Our central position in the City allows us to identify cross-department data quality concerns. As a result, we will occasionally participate in and even lead, if needed, a targeted effort to improve data quality. While this strategy is no substitute for a broader strategy, it can fill certain critical data gaps.
Strategy 2.3. Provide mechanisms to elicit and track feedback and learnings from data users. We discovered in Year 1 that we had a paucity of feedback mechanisms. While creating our help desk was a first step, we need richer and more scalable approaches for user feedback. Some of these we expect from our vendor, but others may require new tools, partnerships, or types of engagements. New tools may include testing social data dictionaries or data wiki pages. We must also explore offline options for engagement (e.g. working groups).
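To make the consistency point at the start of Goal 2 concrete, the sketch below maps department-specific service categories onto one shared vocabulary so that records from different systems can be combined. It is only an illustration: the department names, category codes and the `standardize` function are hypothetical and do not describe an adopted City standard.

```python
# Hypothetical crosswalks from each department's local codes to shared category names.
DEPARTMENT_CROSSWALKS = {
    "dept_a": {"HSG": "housing", "HLTH": "health"},
    "dept_b": {"Housing Svcs": "housing", "Transit": "transportation"},
}


def standardize(department, records):
    """Return (standardized, unmapped) records for one department.

    Records whose category has no crosswalk entry are returned separately
    so that they can be reviewed rather than silently dropped.
    """
    crosswalk = DEPARTMENT_CROSSWALKS[department]
    standardized, unmapped = [], []
    for record in records:
        shared = crosswalk.get(record["category"])
        if shared is None:
            unmapped.append(record)
        else:
            standardized.append({**record, "category": shared})
    return standardized, unmapped


dept_a_ok, dept_a_review = standardize(
    "dept_a",
    [{"id": 1, "category": "HSG"}, {"id": 2, "category": "UNKNOWN"}],
)
dept_b_ok, dept_b_review = standardize(
    "dept_b",
    [{"id": 3, "category": "Housing Svcs"}],
)

# Once both departments use the shared vocabulary, their records can be combined.
combined_housing = [r for r in dept_a_ok + dept_b_ok if r["category"] == "housing"]
print(combined_housing)
```

Returning unmapped records instead of discarding them also gives a simple feedback signal about where definitions still differ between departments.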
Goal 3.
Support increased use of data in decision-making
Once data is available, we need to use it. Effective use consists of individual and department capacity as well as a broader public capacity for using data in decision-making. Capacity consists of shared data and access, as well as data literacy, analytics, managing with data, and displaying and communicating data. We need to match the availability of data with the capacity to use data, both in terms of people and technology.
Subgoal 3.1 Increase internal capacity
Strategy 3.1. Grow Data Academy and explore methods to institutionalize as part of professional development. Last fall we launched the Data Academy in partnership with the City Services Auditor (CSA). The demand for courses has been high, with every course at capacity and with a waitlist. For Year 2, we want to add classes, bring in external trainers, and explore ways to leverage massive open online courses. Part of the curriculum extension will be to incorporate classes that are targeted at managerial and leadership roles. In addition, we want to explore integrating Data Academy courses into formal training venues or as part of job series. Viewing data literacy as a professional development strategy versus a series of ad hoc trainings will be key to transforming data capacity across the City - at both department and individual staff levels.
Strategy 3.2. Provide enduring materials and resources for data tools and techniques. While the Data Academy provides an opportunity for direct training, we want to supplement that with enduring resources that are available outside of the classroom and serve a broader audience. In particular, we will explore how to showcase tools or other resources and provide supplementary materials, e.g. guidance or tool guides. A particular focus will be on geographic/mapping tools as well as data quality tools. This could also include exploring means to better distribute previous analyses or work.
Strategy 3.3. Help establish department stat programs based on department readiness. We will continue to partner with the City Services Auditor and the strengthened Performance Management team within the office. We expect to be in a supporting and partnering role and will focus on enhancing or extending their work, not leading.
Strategy 3.4. Explore opportunities to supplement analytical capacity. While the City has a great deal of analytical talent, we are interested in enhancing both the amount and type of analytical capacity. Opportunities may exist for partnerships with external organizations, working with volunteers, issuing challenges or enhancing existing staff.
Subgoal 3.2 Support public capacity
Strategy 3.5. Continue to develop our portfolio of transparency tools and websites. Transparency tools and websites go beyond simply publishing data to transforming the data into information that can be consumed and understood by the general population. The Housing Data Hub is one example. Each of these tools provides policy makers and the public with ready access to City data contextualized and presented in a way that informs decision-making. Typically, these sites are built on open data. We will continue to develop our own sites as well as partner and/or promote sites being built by City departments.
Strategy 3.6. Explore methods to increase public capacity for data use. Transparency websites are one form of capacity building, but they rely on a single channel, a website, to engage the public about City data. We are interested in exploring other methods, whether it is trainings at the Library, workshops at community or neighborhood events, or collaborative problem-solving. We expect any additional methods will also increase our own capacity to present the City’s data more effectively and to be more responsive to the broader community.
Subgoal 3.3 Foster and incent a data culture
Strategy 3.7. Explore the creation of shared frameworks for data and evaluation. A common language and approach to data-driven decision making can help set a roadmap and ease the effort needed from departments. For example, imagine if any new initiative included a data, evaluation and performance management strategy. This goes beyond simply requiring evaluation to the continuous measuring and retooling of policies and programs based on a stream of real-time data and experimentation integrated into program management and processes. Instead of being a pre/post appendage, data and evaluation become part of the team. We will explore creating a shared framework to inform the launch of new programs, including defining key outcomes, the data and evaluation plan, and performance management needs. For example, a data plan could address data sourcing and collection needs, data sharing requirements and data model creation. It could also address how to integrate data needs into business processes and technical systems. Lastly, it could discuss how to create management tools, including measures, dashboards, staffing and business processes. The framework could be implemented or tested in a variety of ways from pilots to training to policy.
Strategy 3.8. Explore creation of data-related peer networks. Data-related peer networks could help foster cross-department problem solving by connecting colleagues with related domain expertise. Employees could share ideas for data use and tools and also identify opportunities to collaborate on cross-department data initiatives.
Strategy 3.9. Communicate the benefits of data-driven decision-making. Clarifying the value of data-driven decision-making and the tangible benefits requires storytelling. And within the City, we’ve heard that one of the primary drivers to adopt Stat programs was hearing what other groups are doing. We need to be better at collecting and communicating stories about effective data use. Not only does this spur new ideas, it showcases the teams that are doing good work, thereby encouraging more.
Goal 4.
Identify and foster innovations in open data and data use
The pace of change in the open data, analytics, and visualization spaces is breathtaking. We need not only to be aware of innovations but also to selectively identify and nurture innovation so that the City and our stakeholders benefit from changes in technology and the experiences of others.
Strategy 4.1. Maintain ongoing reviews of best practices and the changing technology landscape. To ensure that San Francisco maintains its leadership position in open data, we have to stay abreast of emerging best practices and changes in technology that can better support or even transform our program. In part, this will be a natural result of our communications and engagement strategy, but retaining it as a specific strategy will help ensure that we are making regular and conscious efforts to assess the rapidly changing landscape. This approach was validated in Year 1, as our quarterly technology landscape sessions resulted in several pivots or technology changes.
Strategy 4.2. Target opportunities to improve data-centric services. The City provides a variety of services and some of these are heavily mediated by data and/or technology and may be cross-departmental. Our experience in Year 1 showed that we have a role to play in guiding or informing these types of projects. As this type of work risks stretching our capacity, we will have criteria for participating, including expected impact and level of departmental resources and commitment. Wherever possible, we will roll the projects or the lessons learned into the larger shared services and public experience strategies discussed in Strategy 1.10.
Strategy 4.3. Selectively partner in or promote data-centric initiatives. Through our engagement strategy and ongoing reviews we hope to identify opportunities for targeted data initiatives or partnerships that involve organizations or people outside of the City. We believe external organizations or perspectives may bring a new approach to existing City challenges or help extend City services. We will also seek opportunities to collaborate with other governments. Part of this work will be to develop clear criteria on when and how we should participate in partnerships as well as methods to elicit external help.
Goal 5.
Continuously improve, scale, maintain and monitor our work
A culture of continuous improvement ensures that we always work to identify where and how we can improve. In some cases, this may be a deliberate choice to not improve if the benefits are less than the effort required. In addition, due to the small size of our team, we need to deliberately seek ways to scale our work both in execution and impact. During each project or activity, we continuously ask ourselves - can we scale this? If the answer is no, we need to change or, on occasion, limit our effort. Lastly, any work we have accomplished needs a deliberate maintenance strategy if we have future need for it.
Activity 5.1. Maintain data catalogs. The dataset inventory that we completed in Year 1 was an enormous undertaking. We need to maintain the resulting list so we can use it to broadly facilitate internal data access and to track data as it changes over time.
Activity 5.2. Maintain, and iterate as needed, methods for prioritizing datasets. We will need to fully deploy and then maintain our various methods for prioritizing the publication of datasets. If new methods emerge, we will incorporate them into our plan.
Activity 5.3. Continuously improve our web presence and supporting processes and materials to better meet the needs of our users. While we will seek to increase the means by which we engage users, our website and supporting tools will likely remain the key point of interaction. As such, we must ensure that they are meeting the needs of our many users, including data publishers, consumers and residents.
Activity 5.4. Continue to partner with Socrata to inform the development of the portal. SF OpenData, our data portal, is a key part of our web presence and how we meet the needs of our users. We will continue to partner with our open data portal vendor to incorporate our users’ needs into the portal’s roadmap.
Activity 5.5. Continuously improve outreach and support for Data Coordinators and publishers. We need to continue to support our Data Coordinators. We expect our support of data publishers to increase in Year 2, both because of the expected increase in publication following the dataset inventory and because we aim to continuously improve the publishing process.
Activity 5.6. Grow and broaden communications and engagement activities. In Year 1, our communications and engagement were focused largely on completing the dataset inventory and engaging our Data Coordinators. In Year 2, we must grow the scope and nature of our outreach. Not only did our survey suggest we are not reaching key parts of City staff, we know that we are not engaging most neighborhoods and communities writ large. Now that we have the key digital channels in place (social media, blog, website), we can build and extend our work. The core goal in this strategy is to broaden awareness and then use of the tools we are providing.
Activity 5.7. Track and measure our progress. In Year 1, we established a framework and set of metrics for tracking our work. We need to maintain that work, automate reporting wherever possible, and make changes as our work evolves. In addition, this requires some level of conscious data collection, whether through surveys, workshops or case studies.
Activity 5.8. Conduct ongoing planning. To ensure our work is on track, we must conduct ongoing planning. Last year we established monthly and quarterly planning meetings that ensured we were meeting the goals of our workplan or, if needed, reevaluating our approach.
5. Priority, Resource and Contingency Analysis
The Open Data Ordinance mandates some of our activities, while others are either in the critical path for broader work or a key part of setting a platform for future success. As a result, we prioritized our strategies using the MoSCoW method in the context of what we think we must accomplish in Year 2 (M=Must, S=Should, C=Could).² This does not mean that certain activities will not become “musts” or “shoulds” over time. We then identified resource gaps as follows:
● No - no resource gap
● Yes - we do not believe we can be successful with existing resources
● Partial - the strategy can be supported at some level with current resources, but should be supplemented to ensure success
We then characterized the gap based on type of need:
● Ongoing - requires a sustainable resource plan as we expect to be actively developing or maintaining this activity over the mid to long term
● Project - requires a one-time solution to resource
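The classification scheme above (priority, resource gap, type of need) can be expressed as a small data structure. The sketch below is illustrative only: the class names, the helper functions and the example entries are hypothetical and do not reproduce values from the plan's table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Priority(Enum):
    MUST = "M"
    SHOULD = "S"
    COULD = "C"


class Gap(Enum):
    NO = "No"
    PARTIAL = "Partial"
    YES = "Yes"


class Need(Enum):
    ONGOING = "Ongoing"
    PROJECT = "Project"


@dataclass
class StrategyAssessment:
    name: str
    priority: Priority
    gap: Gap
    need: Optional[Need] = None  # only meaningful when a gap exists
    contingency: str = ""


def needs_contingency(item):
    """A contingency strategy is relevant whenever some resource gap exists."""
    return item.gap is not Gap.NO


def sort_by_priority(items):
    """Order assessments Must, then Should, then Could (stable within a level)."""
    order = {Priority.MUST: 0, Priority.SHOULD: 1, Priority.COULD: 2}
    return sorted(items, key=lambda item: order[item.priority])


# Hypothetical entries for illustration only.
examples = [
    StrategyAssessment("Example strategy C", Priority.COULD, Gap.NO),
    StrategyAssessment("Example strategy A", Priority.MUST, Gap.PARTIAL, Need.ONGOING,
                       "Scale to capacity"),
    StrategyAssessment("Example strategy B", Priority.SHOULD, Gap.PARTIAL, Need.PROJECT,
                       "Seek partners"),
]

for item in sort_by_priority(examples):
    print(item.priority.value, item.name, item.gap.value, needs_contingency(item))
```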
Lastly, the table includes a brief contingency strategy if we are unable to close the resource gap.
Table: Prioritization, Gap Analysis and Contingency Plan
Strategy
M
Strategy 1.1. Continue to mature our program to automate publication of data.
X
Strategy 1.3. Target departments for wholesale data automation.
X
Strategy 1.5. Establish methods to ensure SF licensing and publication of data for new information systems
X
Strategy 1.2. Develop self-service model for data automation for large departments.
X
Strategy 1.4. Develop a geographic data access and publishing strategy.
X
Strategy 1.6. Create “ShareSF” hub and develop supporting resources and business processes.
X
S
C Gap
Type of Need
Contingency Strategy if Unable to Close Gap
Partial Ongoing We plan to close this gap by hiring a new role for open data services. This gap will exist until we complete the hire and onboarding. Partial Ongoing If we are unable to hire the right mix of skills we will plan to reallocate responsibility among Partial Project existing staff and selectively partner with a handful of Partial Ongoing departments with related expertise.
Partial Project
We will seek external and internal partners for help developing this.
Partial Ongoing We plan to hire later this year and that will partially close the gap; we will also rely on key department partnerships and will seek external funding.
² MoSCoW prioritization is traditionally used in software development to determine what requirements you Must have, Should have, Could have, and Won’t have. In our case, we used it to prioritize our activities.
Strategy 1.7. Explore technical solutions for confidential data sharing.
X
Partial Ongoing We will scale to our capacity and may seek external funding.
Strategy 1.8. Create a process for accessing your individual data.
X
Partial Project
We will rely on interns and partner with the Public Information Officers to complete this.
Strategy 1.9. Develop methods to connect internal users to datasets.
X
Partial TBD
Strategy 1.10. Integrate internal data access needs into emerging technology strategies.
X
No
X
No
X
Partial Ongoing We will seek department partners and we may seek external funding.
X
Partial Ongoing We will only engage in projects with department engagement and resource commitment.
Strategy 1.11. Explore options to develop shared data systems for internal use.
Strategy 2.1. Develop comprehensive data quality strategy for the City; implement via pilots and broader COIT strategies
Strategy 2.2. Conduct targeted data quality improvements.
Strategy 2.3. Provide mechanisms to elicit and track feedback and learnings from data users.
Strategy 3.1. Grow Data Academy and explore methods to institutionalize as part of professional development.
Strategy 3.2. Provide enduring materials and resources for data tools and techniques.
We will tailor sub-projects to scale to our capacity and will reexamine based on need.
Partial Ongoing We will scale for our capacity and seek external partners to help frame and move this forward.
X
X
X
Strategy 3.3. Help establish department stat programs based on department readiness.
Partial Ongoing We will seek to expand our department partnership to HR and also explore bringing in external teachers for advanced topic areas.
Partial Ongoing We will scale subprojects based on our capacity and encourage departments to contribute. X No
Strategy 3.4. Explore opportunities to supplement analytical capacity.
X Partial Ongoing We will scale for our capacity and seek external funding.
Strategy 3.5. Continue to develop our portfolio of transparency tools and websites.
X
Strategy 3.7. Explore the creation of shared frameworks for data and evaluation.
X
Strategy 3.6. Explore methods to increase public capacity for data use.
Partial Ongoing We will scale for our capacity, seek external funding, and require committed department partners.
X Partial Ongoing We will scale for our capacity and seek external funding.
Partial Ongoing We will scale for our capacity, seek external funding, and seek department partners.
Strategy 3.8. Explore creation of data-related peer networks.
X No
Strategy 3.9. Communicate the benefits of data-driven decision-making.
Strategy 4.1. Maintain ongoing reviews of best practices and the changing technology landscape.
X
X
Partial Ongoing We will scale for our capacity and may seek external funding or partnerships. No
Strategy 4.2. Target opportunities to improve data-centric services.
X Partial Ongoing We will scale for our capacity and require committed department partners.
Strategy 4.3. Selectively partner in or promote data-centric initiatives
X Partial Ongoing We will scale for our capacity and may seek external funding or partners.
6. Conclusion
Data can feel dry, boring and academic. At the same time, everyone loves a good story. But every story has a rich vein of data threaded throughout, describing a pattern and illuminating a path forward. It’s only when we link the data narratives that underlie our stories that we are able to make new connections that lead to new insights about what is working or what is possible. This plan is not about data for data’s sake. This plan is about transforming how we enrich our understanding, our experience and our City with data.
Appendices
Appendix A. Acknowledgements
A number of people, too numerous to list, have contributed to our work, our thinking and our inspiration. Below are a handful of thanks - we may have missed some, if so, our apologies!
Our local brigade, Code for San Francisco, has become a fantastic partner - we learn from and with them and value the relationships that have developed. Thank you especially to Jesse Biroscak, Maddie Suda, Julio Feliciano, Judy van Soldt, and Katherine Nemacher.
Many thanks to my colleagues in other places for sharing their worries, their challenges and their solutions. I love that we are on this journey together! Barbara Cohn, Stuart Drown, Laura Meixell, Abhi Nemani, Andrew Nicklin, Maksim Pecherskiy, Tom Schenk, and Tim Wisniewski.
Our Internal Advisory Group provided guidance and strategic direction. Many thanks to Carmen Chu, Miguel Gamiño, Luis Herrera, Kate Howard, Steve Kawa, Ed Reiskin, and Ben Rosenfield.
The following people have become friends and thought partners throughout this process: Anthony Ababon, Krista Canellakis, Cyndy Comerford, Ted Conrad, Jason Cunningham, Rebecca Foster, Luke Fretwell, Jane Gong, Kate Howard, Chanda Ikeda, Matthias Jaime, Lani Kent, Kelly Kirkpatrick, Carol Lu, Andy Maimoni, Ashley Meyers, Jay Nath, Tajel Shah, Chris Simi, Peg Stevenson, John Tucker, Marisa Pereira Tully, and Melissa Whitehouse.
The following folks have been key parts of making everything happen. Their insight, commitment, and persistence have helped all we have done be successful this year: Jason Lally, Jeff Johnson, Samuel Valdez, Sherman Luk, Jessie Rubin, Andrew Ju, Kyle Patterson, Laura Marshall and Kyra Sikora. Read more about their contributions here: http://datasf.org/about/.
And we’ve had an amazing stream of interns who have been critical to so many projects. Thank you each for your energy and commitment: Peri Weisberg, Erica Finkle, Laura Gerhardt, Christina Malamut, Charlotte Hill, Dan Wilcox, Evgenia Likhovtseva, and Marcelo Milanello.
Read more about their contributions here: http://datasf.org/about/.
Last, but so far from least our Data Coordinators - our core activity and output from Year 1 would not exist without the collective effort of our Data Coordinators and other supporting staff: Mullane Ahern, Darrell Ascano, Colleen Burke-Hill, Carol Chapman, Eddy Ching, Mike Choi, Joanne T. Chou, Marina Coleridge, Robert Collins, Elise Crane, Keith DeMartini, Matt Dorsey, Sarah Duffy, Tiarra Earls, Kevin Edwards, Penni Eigster, Sandra Eng, Cheong-Tseng Eng, Cynthia Goldstein, Zihong Gorman, Brandon Grissom, Michele Gutierrez-Canepa, John Halpin, David Hardy (LT), Kurian Joseph, Jennifer (Zoey) Kroll, Michael Lambert, Craig Lee, Alexander Levitsky, Brent Lewis, Thomas Lindman, Ferry Lo, Jose Luis Perla, Andy Maimoni, Maria X Martinez, Steven Massey, Eddie McCaffrey, Maria McKee, Jesus Mora, John Murray, Wilson Ng, Stephanie Nguyen, Eric Pawlowsky, Jeff Pera, Joshua Raphael, Stacy T. Robson, Guillermo Rodriquez, Leah Rothstein, Ken Salmon, Valeri Shilov, Mitch Sutton, Marianne Thompson, Charles Thompson, Anne Trickey, Alan Tse, Tyler Vu, Mike Webster, Chris Wisniewsky, Gloria Woo, Mike Wynne, and Theresa Zighera.
Appendix B. Detailed Accomplishments in Year 1
For each of our Year 1 goals, we highlight the key accomplishments and status of each strategy. Our quarterly milestones document (a Google Doc) provides an accounting by quarter of our milestones and our progress on them per our strategic plan. The quarters cover Fiscal Year 2014-2015, which started on July 1, 2014 and ended on June 30, 2015. Below is a high-level summary for each strategy by goal.
Goal 1: Increase number and timeliness of datasets on DataSF
Strategies with Key Accomplishments / Status:
Strategy 1.1. Establish the role of data coordinators and support development of data catalogs.
● 52 data coordinators appointed
● (75%) of department inventories complete
● Inventory published on SF OpenData
Strategy 1.2. Develop methods to inform the prioritization of datasets for publication.
● Developed 4 methods to prioritize datasets
● Streamlined data nomination process and deployed help desk and request tracking
Strategy 1.3. Develop metrics to track and measure progress in publishing open data.
● Developed progress measures and KPIs to support quarterly report
● Developed evaluation framework for measuring impact of open data
● Coming soon: Public launch of department publishing plans and automated reporting
Strategy 1.4. Develop our program to automate publication of data.
● Developed business case in partnership with DT
● Created program and services model
● Established technology, business processes and support documents
● Secured full time resource for program
Strategy 1.5. Develop an outreach and support program for data coordinators and other data publishers.
● Created Data Coordinator Portal and supporting tools, templates and training
● Created Publisher Portal with standard publication process and training
● Developed a submission process and packet
● Created series of guidebooks and conducted in person and online trainings
Strategy 1.6. Establish methods to ensure SF licensing and publication of data for new information systems.
● In progress, project was delayed due to lack of best practices and resource constraints
Goal 2: Improve usability of DataSF
Strategies with Key Accomplishments / Status:
Strategy 2.1. Better leverage existing services and features from Socrata.
● Conducted analysis and rolled into other strategies
Strategy 2.2. Partner closely with Socrata to inform the development of the portal.
● Joined customer advisory board
● Participated in usability testing of new key feature
● Actively participate in roadmap and direction
● Participate in monthly roadmap meeting
Strategy 2.3. Redesign our web presence and supporting processes and materials to better meet the needs of our users.
● Redesigned and launched new data portal
● Launched new web home
● Early partner in technology preview for new dataset design
Goal 3: Improve the usability, quality, and consistency of our data
Strategies with Key Accomplishments / Status:
Strategy 3.1. Establish metadata standards for published data.
● Created and implemented new standard
Strategy 3.2. Establish mechanisms to elicit and track feedback and learnings from data users.
● Analysis suggested gap; will deploy new methods in Y2
Strategy 3.3. Explore the creation of data quality processes and measures.
● Conducted research and laid out approach for Y2, including a data playbook and identified initial partners or topics for pilots
*NEW* Strategy 3.4 Conduct targeted data quality improvements
● Worked to incorporate inclusionary housing program data needs into upstream planning business process; turned into broader housing data pipeline project that will extend into Y2
*NEW* Strategy 3.5 Reset and standardize datasets on DataSF
● Complete and in monitoring mode
● Created standard guidelines
Goal 4: Enable use of private data, while appropriately protecting it
Strategies with Key Accomplishments / Status:
Strategy 4.1. Create a data classification and sharing standard. *REVISED* Develop a strategy to enable internal data sharing
● In partnership with CSA, convened departments in HSS and developed a multi-year strategy
● Obtained dedicated resources to support going forward
Strategy 4.2. Create a process for accessing your individual data.
● Modified strategy to leverage existing processes; will deploy in Y2
Goal 5: Support increased use of data in decision-making
Strategy
Key Accomplishments / Status
Strategy 5.1. Establish a training curriculum to support increased use of data in decision-making.
● In partnership with CSA, launched Data Academy in Fall; all classes booked out with waiting lists
● Conducted department trainings after being approached by departments
Strategy 5.2. Help establish department stat programs based on department readiness; codify lessons learned and materials for broader use
● Partnered with CSA to:
○ Develop 2 case studies of department Stat programs
○ Develop assessment tool and guidebook for creating stat programs
● Piloted approach in department, which is in progress
● Will launch cumulative work as the Stat Starter Kit in early Y2
Strategy 5.3. Continue to develop our portfolio of transparency tools and websites.
● Developed the Housing Data Hub in partnership with multiple departments
Goal 6: Identify and foster innovations in open data and data use
Strategy
Key Accomplishments / Status
Strategy 6.1. Develop and maintain a communications and engagement strategy.
● Conducted analysis and plan
● Increased Twitter following
● Established blog
● Created CDO listservs
Strategy 6.2. Conduct ongoing reviews of best practices and the changing technology landscape.
● Conducted review and codified quarterly process
Strategy 6.3. Identify and enable targeted data-centric initiatives.
● Working to automate and analyze housing inspections data from 3 departments; will explore extension of work in Y2
Strategy 6.4. Establish a data licensing framework and standard.
● Completed analysis and made recommendation
● Obtained legal agreement with recommended standard
● Rollout and transition strategy underway
Appendix C. Quarterly Milestones for Year 2
For each of our strategies, we outline a set of quarterly milestones and expected resources. Adjustments to the milestones may occur based on resources or other factors as discussed in Section 5. You can view the milestones and related timeline in a Google spreadsheet.
Appendix D. Crosswalk between plan and Open Data Policy
Sec. 22D.2. Chief Data Officer and City Departments
(a) Chief Data Officer
#
Clause
Implementation
(a)
Chief Data Officer. In order to coordinate implementation, compliance, and expansion of the City's Open Data Policy, the Mayor shall appoint a Chief Data Officer (CDO) for the City and County of San Francisco. The CDO shall be responsible for drafting rules and technical standards to implement the open data policy, and determining within the boundaries of law which data sets are appropriate for public disclosure. In making this determination, the CDO shall balance the benefits of open data set forth in Section 22D.1, with the need to protect from disclosure information that is proprietary or confidential and that may be protected from disclosure in accordance with law. Nothing in the rules and technical standards shall compel or authorize the disclosure of privileged information, law enforcement information, national security information, personal information, unless required by law. Nothing in the rules or technical standards shall compel or authorize the disclosure of information which is prohibited by law.
This document serves to meet the general expectations. Subgoal 1.2 will protect proprietary or confidential information.
(b)
The CDO's duties shall include, but are not limited to the following:
-
(b)(1)
Draft rules and technical standards to implement the open data policy ensuring the policy incorporates the following principles:
(b)(1)(A)
(A) Data prioritized for publication should be of likely interest to the public;
(b)(1)(B)
(B) Data sets should be free of charge to the public through the web portal;
Existing practice
(b)(1)(C)
(C) Data sets shall not include privileged or confidential information, law enforcement information, national security information, personal information, proprietary information or information the disclosure of which is prohibited by law; and
Managed via publication process and Subgoal 1.2.
(b)(1)(D)
(D) Data sets shall include, to the extent possible, metadata descriptions, API documentation, and the description of licensing requirements. Common core metadata shall, at a minimum, include fields for every dataset's title, description, tags, last update, publisher, contact information, unique identifier, and public access level as defined by the CDO.
Complete and managed via publication process.
(b)(2)
(2) Coordinate, maintain, and update the City's Open Data website, currently known as "DataSF";
(b)(3)
(3) Present the Open Data rules and technical standards to the Committee on Information Technology (COIT) for adoption;
COIT is the forum used to pass rules and technical standards.
(b)(4)
(4) Provide education and analytic tools for City departments to improve and assist with the release of open data to the public;
Deployed via Strategy 1.2 in FY14-15; Maintained via Activity 5.2 in FY15-16.
See Activity 5.3.
See Strategies 1.1, 1.2, 1.3, 1.4, and Activity 5.5.
(b)(5)
(5) Assist departments by collecting and reviewing each department's open data implementation plans and creating a template for the departmental quarterly progress reports;
(b)(6)
(6) Present an annual citywide implementation plan to COIT, the Mayor, and Board of Supervisors and respond, as necessary, to inquiries regarding the implementation of the open data policy and the compliance of departments with the deadlines established in this section.
This plan will be presented to all of these groups.
(b)(7)
(7) Help establish data standards within and outside the City through collaboration with external organizations;
New standards will be developed as needed.
(b)(8)
(8) Assist City departments with analysis of City data sets to improve decision making;
See Goal 3
(b)(9)
(9) Establish a process for providing citizens with secure access to their private data held by the City;
See Strategy 1.8
(b)(10)
(10) Establish guidelines for licensing open data sets released by the City and evaluate the merits and feasibility of making City data sets available pursuant to a generic license, such as those offered by "Creative Commons." Such a license could grant any user the right to copy, distribute, display and create derivative works at no cost and with a minimum level of conditions placed on the use; and,
Complete, will formalize via COIT standard.
(b)(11)
(11) Prior to issuing universally significant and substantial changes to rules and standards, solicit comments from the public, including from individuals and firms who have successfully developed applications using open data sets.
(b) City Departments
Complete and maintained via Activity 5.5.
Standard practice; Rules and standards will also be presented to COIT, a public forum
#
Clause
Implementation
(b)
Each City department, board, commission, and agency ("Department") shall:
-
(b)(1)
Make reasonable efforts to make publicly available all data sets under the Department's control, provided however, that such disclosure shall be consistent with the rules and technical standards drafted by the CDO and adopted by COIT and with applicable law, including laws related to privacy;
Supported by Strategies 1.1-1.4 and Activity 5.5.
(b)(2)
Review department data sets for potential inclusion on DataSF and ensure they comply with the rules and technical standards adopted by COIT;
Complete and maintained by Activity 5.5.
(b)(3)
Designate a Data Coordinator (DC) no later than three months after the effective date of Ordinance No. 285-13, who will oversee implementation and compliance with the Open Data Policy within his/her respective department. Each DC shall work with the CDO to implement the City's open data policies and standards. The DC shall prepare an Open Data plan for the Department which shall include:
Complete
(b)(3)(A)
A timeline for the publication of the Department's open data and a summary of open data efforts planned and/or underway in the Department;
Publication plans are publicly available and updated bi-annually.
(b)(3)(B)
A summary description of all data sets under the control of each Department (including data contained in already-operating information technology systems);
Complete other than rolling acceptances from departments; Data Inventory available on SF Open Data.
(b)(3)(C)
All public data sets proposed for inclusion on DataSF;
See previous
(b)(3)(D)
Quarterly updates of data sets available for publication.
Centralized through publishing program
(b)(4)
The DC's duties shall include, but are not limited to the following:
(b)(4)(A)
No later than six months after the effective date of Ordinance No. 285-13, publish on DataSF, a catalogue of the Department's data that can be made public, including both raw data sets and application programming interfaces ("API's").
Complete, though accepting rolling submissions
(b)(4)(B)
Appear before COIT and respond to questions regarding the Department's compliance with the City's Open Data policies and standards;
Will be done as needed
(b)(4)(C)
Conspicuously display his/her contact information (including name, phone number or email address) on DataSF with his/her department's data sets;
Supported by central help desk to facilitate tracking and formalized via Strategy 2.3.
(b)(4)(D)
Monitor comments and public feedback on the Department's data sets on a timely basis and provide a prompt response;
See previous
(b)(4)(E)
Notify the Department of Technology upon publication of any updates or corrective action;
Existing practice
(b)(4)(F)
Work with the CDO to provide citizens with secure access to their own private data by outlining the types of relevant information that can be made available to individuals who request such information;
See Strategy 1.8
(b)(4)(G)
Implement the privacy protection guidelines established by the CDO and hold primary responsibility for ensuring that each published data set does not include information that is private, confidential, or proprietary; and
Supported by publication process and Strategies 1.1-1.4 and Activity 5.5.
(b)(4)(H)
Make reasonable efforts to minimize restrictions or license-related barriers on the reuse of published open data.
Citywide license adopted in FY14-15.
(c) Department of Technology
#
Clause
Implementation
(c)
The Department of Technology (DT) shall provide and manage a single Internet site (web portal) for the City's public data sets (http://data.sfgov.org or successor site), called "DataSF." In managing the site, DT shall:
Current practice; Managed by OCDO.
(c)(1)
Publish data sets with reasonable, user-friendly registration requirements, license requirements, or restrictions that comply with the rules and technical standards drafted by the CDO and adopted by COIT;
Current practice
(c)(2)
Provide mechanisms for departments to indicate data sets that have been recently updated;
Current practice
(c)(3)
Include an on-line forum to solicit feedback from the public and to encourage public discussion on Open Data policies and public data set availability;
Current practice
(c)(4)
Forward open data requests to the assigned DC; and,
Current practice
(c)(5)
Take measures to ensure access to public data sets while protecting DataSF from unlawful abuse or attempts to damage or impair use of the website.
Current practice, though in practice this is managed by our vendor, Socrata
Sec. 22D.3. Standards and Compliance
#
Clause
Implementation
(a)
The CDO and COIT shall work with the Purchaser to develop contract provisions to promote Open Data policies. The provisions shall include rules for including open data requirements in applicable City contracts and standard contract provisions that promote the City's open data policies, including, where appropriate, provisions to ensure that the City retains ownership of City data and the ability to post the data on data.sfgov.org or make it available through other means.
See Strategy 1.5
(b)
The following Open Data Policy deadlines are measured from effective date of Ordinance No. 285-13:
During the passage of this policy, the deadlines were made dependent on the CDO hire
(b)(1)
Within three months, department heads designate Department Data Coordinators to oversee implementation and compliance with the Open Data Policy within his/her respective department;
Complete
(b)(2)
Within six months, each Department shall begin conducting quarterly reviews of their progress on providing access to data sets requested by the public through the designated web portal;
Quarterly reviews are automated via public publishing plans available online
(b)(3)
Within six months, each Department shall publish on DataSF a catalogue of their Department's data that can be made public, including both raw datasets and APIs; and
Complete per extended timelines requested by OCDO; ~25% of departments have not completed inventory as of June 30, 2015
(b)(4)
Within one year, the CDO shall present updated citywide Open Data implementation plan to COIT, the Mayor and Board of Supervisors.
The Open Data plan will be presented per COIT meeting timeline
(b)(5)
The CDO may propose a modification, for adoption by COIT, of the timelines set forth in the legislation.
Was requested and approved for data inventory