ABHISHEK VERMA
470, 16th St, NW Apt#2017, Atlanta, GA 30363
ph: (425) 802-3396
email: [email protected]
A/S/N: 27/Male/Indian

OBJECTIVE
Seeking a full-time opportunity in a reputed organization to work on challenging software design and development projects.

EDUCATION
Georgia Institute of Technology, Atlanta, GA
Master of Science, Computer Science, Aug 2009 - May 2011
GPA: 3.9/4.0
Birla Institute of Technology and Science, Pilani, India
Bachelor of Engineering (Hons.), Computer Science, Aug 2001 - Jun 2006
GPA: 9.17/10.0

INDUSTRY EXPERIENCE
Microsoft Research, Redmond, WA
Research Intern, Extreme Computing Group, May 2010 - Aug 2010
• Developed a minifilter driver module to monitor IO buffers in the Windows NT kernel and retrieve statistics that enable content fingerprinting of virtual machines during boot and in VDI scenarios.
• Evaluated the performance of IO monitoring vs. memory introspection as techniques for analyzing memory consolidation opportunities in co-hosted virtual machines for boot and VDI scenarios.
Microsoft India Development Center, Hyderabad, India
Software Development Engineer in Test, Structured Data Search Group, May 2008 - Jun 2009
• Designed and developed document processing pipelines to crawl structured content, such as MSN commerce feed data from partners (CNET, Etilize) used by product search and RSS content (blogs) for the MSN travel feed.
Digibee Microsystems, Bangalore, India
Member of Technical Staff, Wireless Systems Group, Oct 2007 - May 2008
• Designed and developed UMTS non-access stratum protocols, namely Call Control and Mobility Management.
Freescale Semiconductor, Bangalore, India
Member of Technical Staff - IC1, Wireless and Mobility Systems Group, Jul 2006 - Sep 2007
• Carried out performance evaluation and memory optimizations to enable HSDPA data rates on Freescale’s MXC platform.

RESEARCH EXPERIENCE
Georgia Institute of Technology, Atlanta, GA
Research Assistant, Center for Experimental Research in Computer Systems (CERCS), Jan 2010 - present
• Working on the Shadowfax project, which defines a notion of dynamically composed GPGPU assemblies to enable CUDA applications to scale to high-performance clusters. The application resides in a Xen virtual machine and is agnostic of the assembly abstraction built dynamically by the Shadowfax runtime, which eases programmability and portability. We characterize HPC and enterprise workloads to best match them with the GPGPUs available in the cluster. Techniques such as API interposition, function marshalling and request batching optimizations enable global scheduling policies, admission control and dynamic retargeting of execution streams. (See Publications.)

PUBLICATIONS
• Shadowfax: Dynamically composed GPGPU assemblies, Alexander Merritt, Vishakha Gupta, Abhishek Verma, Ada Gavrilovska, Karsten Schwan, 5th Workshop on Virtualization Technologies in Distributed Computing (VTDC 2011), San Jose, USA, June 2011

PROJECTS
GPGPU platform for CUDA applications on high-performance clusters
• Design, implementation and performance evaluation of API remoting extensions and request batching optimizations for GViM, a GPGPU virtualization framework, to enable CUDA applications to scale to a cluster of GPGPU nodes.
• Enabling multi-threading in the virtualized GPU driver to support multiple CUDA contexts per Xen guest (ongoing).
Page sharing analysis in Windows virtual machines for boot and VDI scenarios
• Performance evaluation of IO buffer monitoring vs. memory introspection techniques for page sharing, in terms of overheads, per-file-extension break-up of sharing yields, and suitability in environments with high churn and no common static content among VMs.
Evaluation of VM fork primitive for rapid and stateful replication of Xen virtual machines
• Evaluated the virtual machine cloning performance of the SnowFlock framework for MPI-based parallel programming on a couple of compute-intensive benchmarks.
Parallelization of Delta Stepping algorithm on GPUs
• Implementation, evaluation and tuning of the delta stepping algorithm, for finding single-source shortest paths in very large graphs, using NVIDIA’s CUDA framework.
Betweenness Centrality algorithm in a distributed environment
• Implementation of the Betweenness Centrality algorithm proposed by Bader et al. in the MapReduce distributed programming model. The original authors had a shared-memory implementation on a Cray XT4.
Parallelization of Association Rule Mining algorithm in a distributed environment
• Implementation of a data-parallel multipass inverted hashing and pruning algorithm, PMIHP, in MPI to enable association rule mining in large datasets.
Dynamic instruction scheduling in superscalar processors
• Simulation and trace-based performance evaluation of a superscalar pipeline that uses Tomasulo’s instruction scheduling algorithm to support out-of-order execution.

SKILLS
• Programming: C/C++, Core Java, x86 assembly, Verilog, CUDA, MPI, OpenMP, Cilk++
• Scripting: Unix (bash, csh), MS-DOS, Perl
• Databases: Oracle (PL/SQL), SQL Server
• Networking: TCP/IP, 3GPP
• Platforms: Linux, Windows, Xen

ACADEMIC HONORS
• Certificate of Merit in Mathematics and Physics, awarded by Central Board of Secondary Education, for being among the top 0.1% of the successful candidates in All India Senior School Certificate Exam, AISSCE 2001.
• Certificate of Merit in Mathematics, awarded by Central Board of Secondary Education, for being among the top 0.1% of the successful candidates in All India Secondary School Exam, AISSE 1999.
• Certificate of Appreciation by Department of Secondary and Higher Education, Govt. of India for academic excellence in AISSCE 2001.
• Certificate of Merit under the CBSE National Scholarship Scheme for academic excellence in AISSE 1999.

REFERENCES
• Available on request.