Planning for a Restricted Data Service: Challenges & Lessons Learned
Stephen Woods, Sarah Irwin, Sarah Pickle
May 20, 2016
The problem at Penn State
No centralized location for research using restricted data
Defining restricted data
Why work with restricted data?
Barriers to use
Goal: create a restricted data service (RDS) that includes secure facilities, secure technology, and expert support.
Plan for the webinar
Learning from our peers (Sarah P.)
PSU landscape (Sarah I.)
RDS partnerships & the University Libraries (Stephen)
Learning from our peers
Microgrant from PSU Libraries
To conduct site visits to institutions with established restricted data services and enclaves.
History, stakeholders, funding
Security of room, tech
Furniture
Services
Snapshot: the University of Virginia
Stakeholders: library, Curry School of Education
Funding: From library: $1,000 for surplussed furniture, server space, service manager’s time
From Curry: space, computers, time of tech support
Services: contract and room management
IES room
Library Data Commons@Curry
Snapshot: the University of Virginia

Category | Feature | Library Data Commons@Curry Room | IES Room
Room | Size | ca. 450 square feet | ca. 36 square feet
Room | Lock | off master key | off master key
Room | Windows | only facing Durant’s office | ---
Furniture | Office desks | 3 | 1
Furniture | Office chairs | 3 | 1
Furniture | Physical storage | locked drawers on desks (3 total) for physical media, external storage devices, and printed materials | locked lateral filing cabinet for physical media, external storage devices, and printed materials
Furniture | Other | small side table; 4 stacking chairs | 1 stacking chair
Technology | Computers | 3 desktops | 1 desktop
Technology | Networked | 2 in order to retrieve data (a) on Libraries’-managed server, (b) using VMware client to connect to virtual machine, e.g., ICPSR Virtual Data Enclave | 0
Technology | Stand-alone | 1 | 1
Technology | Software | SAS, SPSS, Stata | SAS, SPSS, Stata
Technology | Printer | --- | ---
Data | Available/able to accommodate | Add Health, Measures of Effective Teaching Longitudinal Database, Head Start Impact Study | Institute of Education Sciences
What we learned
Opening even a small-scale service can make a big impact
Consider which data providers will give you the best value for your money
Be prepared for users to need their hands held
Hire experts in data security to provide services and IT support; leverage their expertise to educate campus about data security
PSU landscape
CONSIDERATIONS FOR ACCESS TO SECONDARY RESTRICTED DATA
My role: Data Archivist
Penn State Population Research Institute (PRI) 2009 – 2013
Main duty = Manage restricted data contracts
Trial by fire (aka, on-the-job training)
Services limited to only PRI affiliated researchers
Data Archivist role was unique at PSU at the time
= OVERWHELMING amount of unmet need
Use of restricted data is increasing
Figure: Contracts processed by the Office of Sponsored Programs per fiscal year, FY2010 to FY2014 (y-axis 0 to 100; bar labels: 3, 20, 38, 39, 90)
PSU considerations
Education & training
Contracts processing
Highly decentralized, disparate environments and support for research & IT
Decentralized environment
IT
• Currently, no central IT services for restricted data
• Local IT staff assist in some colleges/departments
• Many researchers left to sort out IT needs on their own
Research
• Currently, no central system to track all restricted use data contracts
• Different protocols based on discipline, types of data used
• Multiple units on campus responsible for research-related processing and approvals
Contract processing at Penn State
Many units, documents, processes, approvals
Contract signature
There is no single, clear contract review and approval path for researchers to follow at Penn State.
Education and Training
• University Libraries + partners
• Working groups
• Grants
Security
• Office of Information Security guidance
Discipline
• College & departmental instruction and training
Research
• Office of VP for Research training and education
?
Bringing it all together: One example
Data Use Agreement Working Group – DUA-WG
Composed of PSU units with signing authority that may process restricted data contracts
Office of Sponsored Programs
Office of Research Protections
Procurement Office
Risk Management
Office of Information Security
General Counsel
The Population Research Institute
University Libraries
Information Technology Services
Where do we go from here?
Restricted data becoming a higher priority for PSU
Conversations happening at high levels and across campus
Data Use Agreement Working Group (DUA-WG)
Libraries and PRI partnership
Future efforts…
RDS partnerships & the University Libraries
FSRDC & RESEARCH DATA ENCLAVES
Our first story: going big
What is the Federal Statistical Research Data Center?
Access to data
Network of research enclaves
Community of experts
PSU Research Institutes (Social Science Research Institute)
Infrastructure for interdisciplinary sponsored research
Office of the Vice President for Research
$850,000 over 5 years
Impact: space & personnel
• Enclave (E-203 instruction room)
• Converted to 8 research pods
• Printers
• Servers
• Office space (Group study)
• Administrator
• Census employee
• Training
• IT setup
Outcomes
Why the University Libraries?
Central campus space … neutral space
Strengthen the connection with existing data services & collections
Access
Free access for PSU [not just sponsored research]
Cost for non-PSU
Network of FSRDC enclaves
University Libraries: service impact
Outsourcing your service (Census)
Shared programming & promotion
Strengthen the connection with existing data services & collections
Expanding our partnerships: another story
Association of Population Centers
Restricted Data Enclave (RDE) Experience
NCES Enclave: Physical Space
ADHealth: Network in a computer classroom
Faculty Office Model
Users
Affiliated faculty
Demography program students
Demolition!
Creative Commons: Flickr https://flic.kr/p/8xucKR
What do our users need?
Doesn’t the FSRDC have all Fed data?
Why NCES & ADHealth?
Partnership needs
Interdisciplinary demands
Who are the users?
Graduate students
Non-affiliated faculty
Security requirements: physical space
Restricted Key Access
Data use at RDE only
Protect machine-readable media
Disclosure
Printouts left in room
Sent to data provider for review
Limit transportation of data
Detailed requirement for ADHealth
Detailed requirement for NCES
Staff Office
Space
ADHealth Research Data Enclave: single workstation
NCES Research Data Enclave: four workstations
Security requirements: computer
Limit data access only to users who obtained permission from provider
Use data on non-networked desktop computers
Restrict copying of data
Limit back-ups to one copy of data
Detailed requirement for ADHealth
Detailed requirement for NCES
RDS: University Libraries
Contract (manager or consultant)?
Serves all faculty and students
Part of an existing position (see figure)
M.S. in statistics
Services
Contract consultation
Limited analytical consultation
Figure: composition of the existing position
Statistical Specialist (UL/Data Learning Center)
Statistical Consulting Center (Math Department)
RDE Contracts (UL) … new (.25 FTE)
Population Research Institute (PRI): contract manager
Serves PRI’s affiliate faculty
Manages contracts
Consultant for University Libraries
IT support
University Libraries
The need (.25 FTE)
Services a single workstation in the NCES Enclave
Services ADHealth Enclave
Position stability
Part of existing full-time IT Staff
NOT a clever graduate student
Population Research Institute
Services 3 workstations in the NCES Enclave
Provides consultation services for the University Libraries
Looking ahead
Space for networked solution models
Integrated analytical service for restricted data research
Statistical
GIS
Data Science
Curation of restricted data is pointless w/o solutions for access
Restricted Data Enclave for PSU Research?