GeoTiff Reader for the GeoTrellis Project through the Eclipse Foundation

Johan Stenberg
Royal Institute of Technology, Stockholm, Sweden

I.

Introduction

GeoTrellis is a general framework for low-latency geospatial data processing, developed using Scala and Akka. The goal of the project is to transform user interaction with geospatial data by bringing the power of geospatial analysis to real-time, interactive web applications. GeoTiff is a public domain metadata standard which allows georeferencing information to be embedded within a TIFF file. This additional information can include map projections, coordinate systems, ellipsoids, datums, and everything else necessary to establish the exact spatial reference for the file. GeoTrellis can already write GeoTiff data natively, but it currently relies on GeoTools to read it. According to the maintainers of GeoTrellis, the GeoTools dependency is quite large. To remove it, GeoTrellis has proposed a Google Summer of Code 2014 project: a new, fast Scala module for reading GeoTiff data. GeoTrellis is part of the Eclipse Foundation.

II.

My Skills and Experience

My name is Johan Stenberg. I will finish my bachelor’s degree in Computer Science at the Royal Institute of Technology in Stockholm, Sweden, this summer, and I am pursuing a master’s degree in Computer Science at the same institute. The project should be written as a module in the Scala language. I have some experience with Scala, but my language of choice is Java, which I have used extensively. One example is the official Android app for Söderberg & Partners, which can be found on Google Play. During the summer of 2012 I also worked as a Java developer at Hitude in Lund, Sweden, where I worked with GAE and GWT. My CV can be found among the references.

The project description emphasizes the word “fast”. I have excellent grades in all algorithm and complexity theory courses, and I am confident that I can make the module as fast as required. I also have some experience with API design, which I think will help me write a good, fast and stable library for reading GeoTiff data files.

III.

Motivation

I have no prior experience with open source, and I wish to learn more about the Eclipse community and the GeoTrellis community. I have great admiration for Eclipse, and I would very much like to contribute to the GeoTrellis project and make it even better than it already is. I also want to improve my Scala skills and my library design skills. I look forward to working with the developers of GeoTrellis and becoming an even better programmer. When browsing the GeoTrellis website I came across this: “We also believe in creating beautiful code, and that programming is a joy that can be increased with well-developed libraries.” This really motivates me for the project, and I hope to come out of this Google Summer of Code season a better developer.

IV.

Project Description

GeoTrellis is a high-performance geoprocessing engine and programming toolkit. It is written in Scala and uses modern libraries such as Akka, Spray and Sphinx. GeoTrellis needs to be able to handle GeoTiff files, which are TIFF raster files that carry geographical information in addition to the regular TIFF information. Today GeoTrellis can write GeoTiff files but uses a large dependency (GeoTools) to read them. The project I propose is to implement a library for loading GeoTiff files into GeoTrellis Raster data types. The library will focus on I/O speed and processing speed. Because the reader has to touch the whole file, the best achievable time complexity is O(n), where n is the file size, and that is the target.
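To make the starting point of such a reader concrete, here is a minimal sketch of parsing the 8-byte header found at the start of every classic (non-BigTiff) TIFF file: a two-byte byte-order mark ("II" or "MM"), the magic number 42, and a 32-bit offset to the first image file directory. The names `TiffHeader` and `parse` are hypothetical and chosen only for illustration; they are not part of GeoTrellis, and the proposed module may be structured quite differently.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical type for illustration only; not a GeoTrellis class.
final case class TiffHeader(byteOrder: ByteOrder, firstIfdOffset: Long)

object TiffHeader {
  // Parses the 8-byte header of a classic TIFF file.
  // Returns Left with a message if the bytes do not look like a TIFF header.
  def parse(bytes: Array[Byte]): Either[String, TiffHeader] =
    if (bytes.length < 8) Left("header needs at least 8 bytes")
    else {
      // Bytes 0-1: "II" means little-endian, "MM" means big-endian.
      val orderOpt: Option[ByteOrder] = (bytes(0).toChar, bytes(1).toChar) match {
        case ('I', 'I') => Some(ByteOrder.LITTLE_ENDIAN)
        case ('M', 'M') => Some(ByteOrder.BIG_ENDIAN)
        case _          => None
      }
      orderOpt match {
        case None => Left("unknown byte-order mark")
        case Some(order) =>
          val buf = ByteBuffer.wrap(bytes, 0, 8).order(order)
          // Bytes 2-3: the magic number, which is 42 for classic TIFF.
          val magic = buf.getShort(2) & 0xFFFF
          if (magic != 42) Left(s"unexpected magic number $magic")
          // Bytes 4-7: unsigned 32-bit offset of the first image file directory.
          else Right(TiffHeader(order, buf.getInt(4) & 0xFFFFFFFFL))
      }
    }
}
```

The offset is widened to a `Long` because it is an unsigned 32-bit value. The 32-bit offsets are also the reason classic TIFF files are limited to roughly 4 GB.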

The real issue with GeoTiff files is that the standard varies. There is one standard on the official GeoTiff website (listed in the references below), but according to Erik Osheim, whom I talked to on the GeoTrellis mailing list, ESRI (a company devoted to geographical technology) has its own tweaks to the standard, which makes implementing a fully compliant reading library hard. Part of the project will therefore be to write down a precise specification of what the reader should implement. A good starting point is the already implemented GeoTiff writer, which handles all the tweaks mentioned above. The library will also be designed with care, so that other developers can use it easily, and documentation will be a priority, all for the ease of the user.

V.

Project Design Description

The most difficult implementation-specific factor in this project is supporting the full GeoTiff standard including all tweaks. This requires research, although the GeoTrellis writer already supports these tweaks and can serve as a guide. A later challenge is to load the file quickly into GeoTrellis Raster types. As mentioned above, the goal is a running time that is linear in the size of the input. Achieving this will require some thought about how I/O is handled. It is important that the library is quick and does not become a bottleneck for the rest of GeoTrellis’s features; poor I/O performance is always frustrating.

A typical API would let the caller specify which file to read and choose whether the read runs on the calling thread or finishes with a callback. When the read is done, the call returns the GeoTrellis Raster data type(s) that were read.

To support the full specification including tweaks, I will write a unit test suite using ScalaTest (link in references). This will require some thought, but a test-driven approach is the right way to do this project. If the standard is modified after the tool is ready, other developers or I can, after GSoC, simply adjust the tests and incorporate the changes. The test suite will also serve as evidence that the full specification is implemented, giving the developers control and avoiding time-consuming mistakes.
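The API shape described above, with one call that runs on the caller's thread and one that completes with a callback, could look roughly like the following sketch. Everything here is an assumption made for illustration: `RasterStub` stands in for the real GeoTrellis Raster type, `GeoTiffReaderSketch`, `read` and `readAsync` are hypothetical names, and the decoding step is passed in as a function because the actual decoder is the subject of the project. Reading the whole file with `Files.readAllBytes` is a simplification that only suits small files.

```scala
import java.nio.file.{Files, Paths}
import scala.concurrent.{ExecutionContext, Future}

// Placeholder for the GeoTrellis Raster type; hypothetical, for illustration only.
final case class RasterStub(cols: Int, rows: Int, cells: Array[Byte])

// Hypothetical reader interface. The decode function is supplied by the caller
// here; in the proposed module it would be the GeoTiff decoder itself.
class GeoTiffReaderSketch(decode: Array[Byte] => RasterStub) {

  // Blocking variant: reads and decodes on the calling thread.
  def read(path: String): RasterStub =
    decode(Files.readAllBytes(Paths.get(path)))

  // Non-blocking variant: the work runs on the given execution context and
  // the caller attaches callbacks to the returned Future (onComplete, map, ...).
  def readAsync(path: String)(implicit ec: ExecutionContext): Future[RasterStub] =
    Future(read(path))
}
```

A caller that wants a callback would write something like `reader.readAsync(path).onComplete { ... }` with an implicit `ExecutionContext` in scope, while a caller that prefers to block simply uses `reader.read(path)`.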

My required deliverables are that the library is fast and correct, that the test suite is implemented correctly, and that the documentation is excellent, so that other developers can use the library without difficulty. Some GeoTiff files have multiple bands, meaning that they contain more than one layer. Being able to read more than one layer is considered a bonus deliverable.

VI.

Benefits for the GeoTrellis Open Source Project

GeoTrellis is an open source organization, which means that developers most commonly receive no monetary compensation for their time. Having a large bulk of work done by a GSoC student therefore lets the developers focus on implementing exciting new features that make GeoTrellis a better framework, instead of spending their valuable time on writing a GeoTiff reader.

• GeoTrellis loses a big dependency, which makes the framework more modular and easier to use. It also keeps GeoTrellis from growing too big. Instead of depending on a third-party library, GeoTrellis can use its own code and modify it as needed, for example if GeoTiff files start incorporating another standard.
• The core developers of GeoTrellis can focus on developing exciting new features instead of something that is rather monotonous and time-consuming. The project is a better fit for a GSoC student than for an experienced GeoTrellis core developer.
• The users of GeoTrellis will benefit from a better tool. With faster reading and one less dependency, the user experience will improve and the user base will grow, leading to more developers helping out and making a better framework.

VII.

Required Deliverables

• A Scala-based library module that reads GeoTiff files quickly and reliably and that implements the full GeoTiff standard specification, including the tweaks mentioned above.
• Excellent documentation and a clear API for the developers who will use the GeoTiff reader module.
• A test suite which verifies the functionality of the library module.

VIII.

Bonus Deliverables

• Support multithreading with multiband raster GeoTiffs.

IX.

Project Time Line

This is an early draft of the project time line which, if followed successfully, fulfills the required deliverables.

• Up to the 19th of May 2014: Get familiar with GeoTrellis, its codebase and the GeoTiff specification, including the ESRI tweaks.
• Have a beta version working with the full GeoTiff specification implemented, along with a ScalaTest test suite covering all the supported standards.
• 11th of August 2014: Have a production-stable library ready for reading GeoTiff files with GeoTrellis. The library should also be fully documented and tested.
• 18th of August 2014: “Pencils Down”.

X.

Summary

I’m the perfect candidate for this project since I have a lot of previous experience with challenging problems and the project explicitly asks for a fast module, which is understandable since TIFF files can be up to around 4 GB (excluding BigTiff). I’m sure that the GeoTrellis community will benefit enormously if the required deliverables are fulfilled. GeoTrellis will lose one large and bulky dependency, bringing it one step closer to the perfect tool. Developers will benefit from a great new, fast API, and users will benefit since GeoTrellis will become faster.

XI.

Resources

• My CV: https://dl.dropboxusercontent.com/u/42266515/JohanStenbergCV.pdf
• Initial project description by the Eclipse Foundation: http://wiki.eclipse.org/Google_Summer_of_Code_2014_Ideas#GeoTrellis:_GeoTiff_Reader
• GeoTrellis website: http://geotrellis.io/
• GeoTiff website: http://trac.osgeo.org/geotiff/
• GeoTools website: http://www.geotools.org/
• GeoTiff specification: http://www.remotesensing.org/geotiff/spec/geotiffhome.html
• ScalaTest website: http://www.scalatest.org/

XII.

Contact Details

1. Cellphone including Swedish country code: +46702504823
2. Email: [email protected]
