# Interview Presentation
by Guido Stein
2022-06-14
Note:
Thank you for joining me today. This presentation shares information about me and my career in order
to give you a better picture of who I am and how I think.
# Overview
- Introduction
- Example Project
- Geospatial Input
- Tabular Input
- Workflow Design
- Conclusions
# Introduction
## Guido Stein
### [gee-doh] like Burrito
Note:
I prefer my name pronounced [gee-doh] like burrito, mosquito, and Dorrito. That said, it's not that
big a deal to me.
# Interrupt me
Note:
I want you to get everything you want out of this presentation, so please feel free to interrupt me
if you have a question, comment, or would like more details.
# Highlight Reel
- History
- Proficiencies
- Tools Used
- Accomplishments
- One More Thing
Note:
I have many qualities that make me a great candidate for Data Engineer.
Here is a highlight reel of me and my career.
# History
- B.A. in Geography from Clark University
- 20+ years working with geospatial data, workflows, and development
Note:
I have a solid foundation in using geospatial data to solve problems. I started my geospatial
learning at university and worked in the industry ever since.
# Proficiencies
- Python
- Javascript
- SQL
- Workflow Design
- Problem Solving
Note:
I have mastery and understanding of the fundamental skills needed to build technical solutions.
# Tools Used
- Node
- Jupyter Notebooks
- Postgres
- AWS
- QGIS
Note:
I have used many of the current tools, applications, and frameworks for developing solutions.
# Accomplishments
- Reputation for geospatial data domain knowledge
- Reputation as a technical problem solver
Note:
I am known for problem-solving with geospatial and other technical skills. Colleagues often reach
out to me to help them solve issues.
# One More Thing
- Co-Chair FOSS4G Boston 2017
- President OSGeo US Local Chapter
- Co-Chair FedGeoDay
Note:
I am a well rounded and valuable team member who can execute, lead, and develop esprit de corps. I
keep current with technological trends and practices by participating in the open source geosptial
community.
# Example Project
## Mapping Access
Note:
To demonstrate how I work, I will share with you this example mapping project that I worked on for
five years.
# Statement Of Work
Create a reusable solution to collect, convert, and aggregate spatial data working with a team
across two offices that report the output data multiple times a year for five years.
Note:
In short, we were collecting data about access to services across the U.S. with multiple service
providers submitting data
# Variables
- stake holders
- proprietary information
- data providers
- team members
- output formats
- reporting dates
Note:
Each variable in this project is a multiplier on the number issues the solution being built needed
to satisfy in order to be succesful
# Reporting Standards
- Tiger block polygons for urban areas
- Tiger road polylines for rural areas
Note:
Aggregated data needed to be reported in the Census Tiger format which is a common standard for U.S.
demographics. It is both a blessing and a curse to use this format because there is variations
from decade to decade because block boundaries and road networks change.
# Conflation
The merging of two or more sets of
information, texts, ideas, etc. into one.
-Google Dictionary
Note:
Combining data is tricky because you risk the chance of mis-representing the source information in
your output.
# Geospatial Input Data
- Point data representing homes with access
- Polygons representing areas with access
- Lines representing roads with access
Note:
Some of the service providers submitted spatial data in the following formats.
# Issue #1
## Road to Road Conflation
![Issue 1](./img/issue_1.png "Issue 1")
Note:
Data providers used multiple reference system geometries for the road linework. Alignment of
geometry or 1:1 relationship caused issues.
# Issue #1
## Solutions
- Buffer provided segment and then join to Tiger segments that are within buffer
- Snap provided segment end points or vertices to Tiger segments then join overlapping geometries
- Convert provided segments to center points and then join to nearest Tiger segment
# Issue #1
## Reasoning and Research
- Buffering segments does not work if new segments along a road are smaller than the potentially
associated Tiger geometry
- There is no way to set a proper tolerance for snapping to Tiger geometry
- Center points are not exact, but they can represent 1:* relationships and can assign data from new
roads to nearest segment in Tiger geometry
# Tabular Input Data
- x,y representing homes
- census tiger ids
# Issue #2
## Processing x,y points
![Issue 2](./img/issue_2.png "Issue 2")
Note:
x,y point data commonly uses decimal degree latitude and longitude coordinates and Tiger Data is in
NAD 83 feet.
# Issue #2
## Solutions
- Convert all incoming data into NAD83 before processing
- Convert all reference data into WGS83 before processing
# Issue #2
## Reasoning and Research
- Each provider data set sent will need to be converted to NAD83, making this a time sink
- Only the output data will need the original source geometry from Tiger
- None of the provider data we looked at came in as NAD83
# Workflow Design
![xkcd 974](img/the_general_problem.png "xkcd 974")
# Issue #3
## What tools should I use?
# Issue #3
## Workflow Design Goals
- modular tools
- self documenting process
- accessible for GIS Analysts
# Issue #3.1.0
## Use FME Desktop
- Develop reusable processing models quickly
- Conversion models act as documentation
- Limited licenses available
# Issue #3.2.0
## Use ArcGIS Data Interoperability extension
- Floating licenses available
- Toolset accessible through familiar ArcGIS
- Editing environment buggy and loses model updates
# Issue #3.3.0
## Use ArcGIS Model Builder
- Can build reusable models
- Models are self documenting and can be stored with processed data
- Models get too complicated and fail when using variables
# Issue #3.3.1
## Use ArcGIS Arcpy Scripts
- Can build reusable scripts
- Not self documenting when run
- Not familiar to GIS Analysts
# Issue #3.3.2
## Use ArcGIS Model Builder and Arcpy Scripts Tools
- Can build reusable scripts
- Models add self documentation and are stored with processing data
- GIS Analysts can use Model Builder to access Python scripts
# Issue #3.3.2
## Results
- Time spent processing data improved with each round by at least 25%
- Documentation of process allowed for auditing providers for QC
- Modular design allowed for new logic and processes to be added to the workflow when needed
# Conclusion
- I have a deep understanding of geospatial concepts
- I have strong skills and knowledge of multiple types of data and technology
- I enjoy developing, building, and troubleshooting data workflows
- I don't mind if you call me Gwee-Do
Note:
I hope that you take away the following from this presentation
# But why REsurety, Inc.?
- Human generated climate change is bad
- You work with tools I like to use (Python,AWS,Postgres)
- You are building and iterating solutions
- Interviewing has been joyful and makes me feel like there is a good culture fit
# ?? Questions ??
## Interview Presentation
by Guido Stein
2022-06-13