Urban Data Science & Smart Cities

URSP688Y
Spring 2024
Urban Studies and Planning
School of Architecture, Planning, and Preservation
University of Maryland, College Park

Instructor
Chester Harvey
National Center for Smart Growth
[email protected]

This repository contains files and other course content for URSP688Y, Urban Data Science and Smart Cities, in Spring 2024. It will be updated regularly throughout the semester. Official announcements, readings, and grades will be handled on ELMS-Canvas.

Students should submit all assignments as pull requests to this repository. Any submitted materials will be public.

Quick Links & Overview

| Week | Topic | Format | Discussion Leader | Demo | Exercise |
| --- | --- | --- | --- | --- | --- |
| Week 1: Jan 29 | Course Introduction, Programming Fundamentals | In-Person | - | | |
| Week 2: Feb 5 | More Programming Fundamentals | Zoom | Jesse | | |
| Week 3: Feb 12 | Tabular Analysis | Zoom | Kayla, Fahmi | | |
| Week 4: Feb 19 | Debugging and Working with Files | Zoom | Jona | | |
| Week 5: Feb 26 | Basic Data Visualization | In-Person | Alanna, Mimi | | |
| Week 6: Mar 4 | Accessing (and Wrangling) Data from the Web | Zoom | Salma | | |
| Week 7: Mar 11 | Geospatial Data | Zoom | Emma, Katy | | |
| Spring Break: Mar 18 | 🏄 | NO CLASS | - | - | - |
| Week 8: Mar 25 | Final Project Proposal, Set up Miniconda and JupyterLab (optional) | NO CLASS | - | - | Final Project: Short Proposals Due |
| Week 9: Apr 1 | Geospatial Data (cont.) | Zoom | Thomas, John | | |
| Week 10: Apr 8 | Network Analysis | Zoom | Tayo, Sururah | | |
| Week 11: Apr 15 | Spatial Visualization | In-Person | Saiful | | |
| Week 12: Apr 22 | BI Software | Zoom | Homayoon | | |
| Week 13: Apr 29 | Machine Learning | Zoom | Ebenezer | | |
| Week 14: May 6 | Final Project Presentations | In-Person | - | - | Final Project: Slides Due |
| Week -1: May 13 | Final Project | NO CLASS | - | - | Final Project: Final Product Due |

Technology

Zoom Room

Websites

GitHub (Coding demos and exercises)

ELMS-Canvas (Readings, grades, and course communications)

Equipment

Please bring a laptop with Wi-Fi connectivity and an updated internet browser to every in-person class. For Zoom classes, please use a computer with a webcam (see “Zoom Guidelines,” below) and be prepared to work on your browser concurrent with my shared screen and camera feeds from classmates. This may take some creative arranging of windows. It may be helpful to have multiple monitors.

If you have challenges accessing appropriate technology, please don’t assume it’s an impediment to taking this class. Let me know and I will try to help.

Course Description

Novel data and computational tools are reshaping planning, development, operation, and understanding of urban systems. These may enable more efficient and equitable distribution of resources, but may also reproduce injustices and divert attention away from more straightforward solutions. This course will introduce students to basic tools and applications for data science to examine urban systems while also challenging them to critique the role of technology in improving cities. What are data science's strengths and weaknesses? Where does it belong (or not) in our planning toolkits? How have planners and technologists appropriately espoused the capabilities of data science and smart cities? And how have these technologies failed to live up to their advertised capabilities? What questions can big data answer, and what issues does it raise? These overarching questions will guide parallel technical and theoretical threads throughout the semester.

The technical thread will use coding demos and short exercises to introduce students to programming logic and Python for urban data science. Demos will be aimed at beginners: students who have never coded before. Exercises will give beginners an opportunity to practice new skills, and more advanced coders an opportunity to stretch their capabilities in urban applications they may not have previously encountered.

The theoretical thread will use reading seminars to examine the emergence, capabilities, and limitations of smart cities, big data, and urban data science.

The threads will converge on a final project that asks students to design a data scientific approach to address a contemporary planning issue and critique its capabilities and limitations. Students may either implement their project in code or write a detailed proposal for it.

This course will prepare you to:

  1. Use programming logic to address analytical questions, and a Python-based toolkit to implement analyses and share them reproducibly with others
  2. Understand the technical and ethical limitations of digital technologies in urban contexts
  3. Be professionally conversant with urban technologies, either as a coder or someone who is able to collaborate in design, implementation, and interpretation of urban analyses

Components

Coding Demos

Each class will include an interactive coding demo, which I will walk through while answering questions and providing additional context on tools, techniques, related theory, and applications in planning. Demo notebooks will be provided for you to run on your own, either during or after class. But I will also present other content ad-hoc, so attending the demos will be essential. They will introduce the basic tools you need for each exercise.

Demos will have some planning and structure, but they will also be a time to roll up our sleeves, experiment, program collaboratively, and Google for help together. They will show how programming is sometimes (or often) messy and frustrating, but with a little grit and ingenuity you can usually find a solution. They will be a welcoming place for beginner coders to embrace uncertainty, ask what might feel like dumb questions, and recognize that everyone else is probably facing the same problems.

Reading Seminars

The other main component of each class will be a discussion-based reading seminar. You should read chapters and articles listed on the schedule before each class and be prepared to discuss them. Readings demonstrate the use of urban data science techniques in research and discuss theoretical issues around the use of urban data science and smart cities technologies. They are meant to show both best practices and opportunities for critique and improvement.

Everyone will be assigned to lead discussion for at least one class session. Discussion leaders should be prepared to pose questions to classmates and keep discussion reasonably focused on the topics covered by that day's reading. Leaders should not give a presentation summarizing the reading for that day. The reading load is designed to be manageable so that everyone can be prepared to discuss all readings.

Exercises (20% of grade)

Short weekly exercises will give you opportunities to try out techniques demonstrated in the coding demos. Each will be structured around a question that you should be able to address using the tools you have learned so far. You are encouraged to outline solutions with pseudocode. A clear and reasonably detailed approach in pseudocode will get 90% credit, just as much as sloppy code with a clean result.

A "clean result" means that I can rerun your code and arrive at the same number, table, figure, or other final output that you did, and that this output addresses the question posed by the exercise. There will rarely be entirely right or wrong answers.

Exercises will be assigned in each class and are due before the next class.
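To illustrate what outlining in pseudocode and then arriving at a rerunnable result might look like, here is a minimal sketch. The question (mean commute time by neighborhood), the sample records, and the function name are all made up for illustration; they are not taken from any actual exercise.

```python
# Pseudocode, written first as comments:
#   for each record:
#       add its commute time to the list for its neighborhood
#   for each neighborhood:
#       mean = sum of times / number of times
#   report the means

from collections import defaultdict

# Hypothetical sample records: (neighborhood, commute time in minutes)
records = [
    ("Northside", 30),
    ("Northside", 40),
    ("Riverside", 20),
    ("Riverside", 25),
    ("Riverside", 45),
]


def mean_commute_by_neighborhood(records):
    """Return a dict mapping each neighborhood to its mean commute time."""
    times_by_neighborhood = defaultdict(list)
    for neighborhood, minutes in records:
        times_by_neighborhood[neighborhood].append(minutes)

    return {
        neighborhood: sum(times) / len(times)
        for neighborhood, times in times_by_neighborhood.items()
    }


mean_commutes = mean_commute_by_neighborhood(records)
print(mean_commutes)  # {'Northside': 35.0, 'Riverside': 30.0}
```

Because the input data live alongside the code, anyone rerunning the cell should get the same output.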

Each exercise will have a Colab/Jupyter notebook available on the course GitHub repository. Please use this standard GitHub workflow to submit each exercise (we'll go over this in class):

  1. Fork the course GitHub repo to your own account (you'll probably only have to do this once at the beginning of the course)
  2. Make a copy of the notebook in your own fork with your last name as an underscored suffix (e.g., exercise01_harvey.ipynb)
  3. Complete the exercise in your copy of the notebook
  4. Commit your notebook (and any other necessary files) to your fork
  5. Make a pull request of your fork to the course repo

Please note that all pull requests to the course repo will be publicly viewable.
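As a small illustration of the naming convention in step 2, the sketch below builds the suffixed filename from a notebook name and a last name. The helper name is hypothetical, and it assumes the provided notebook is called something like exercise01.ipynb and that last names are lowercased, as in the example above.

```python
from pathlib import Path


def personalized_notebook_name(notebook, last_name):
    """Return the notebook filename with an underscored last-name suffix."""
    path = Path(notebook)
    # Keep the original extension; lowercase the name to match the example.
    return f"{path.stem}_{last_name.lower()}{path.suffix}"


submission_name = personalized_notebook_name("exercise01.ipynb", "Harvey")
print(submission_name)  # exercise01_harvey.ipynb
```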

Exercises will be graded out of 10 points based on this rubric:

  • 0: nothing handed in
  • 6: sloppy or illogical code or pseudocode; no clean result
  • 9: readable and logical code or pseudocode; no clean result
  • 9: sloppy code; clean result
  • 10: readable and logical code or pseudocode; clean result
  • 11: wow

Free pass: Your lowest exercise grade (including a 0 if you just don't turn one in) will be dropped.
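The sketch below shows one way the rubric and the free pass could combine into an overall exercise percentage. It assumes the remaining scores are simply averaged out of 10 points each, which the rubric does not state explicitly; the function name and sample scores are hypothetical.

```python
def exercise_average(scores, points_possible=10):
    """Return the exercise average as a percentage after dropping the lowest score."""
    if len(scores) < 2:
        raise ValueError("Need at least two scores to drop the lowest one.")
    kept = sorted(scores)[1:]  # free pass: drop the single lowest score
    return 100 * sum(kept) / (points_possible * len(kept))


# Hypothetical scores, including one missed exercise (0) and one "wow" (11)
scores = [10, 9, 0, 9, 6, 11]
average = exercise_average(scores)
print(f"{average:.1f}%")  # 90.0%
```

Here the 0 is dropped, so the average is based on the remaining five scores (45 of 50 points).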

Submissions more than one week late will not be accepted.

Final Project (50% of grade)

The course will culminate with a final project, delivered in three stages—a short proposal (10%), presentation (10%), and final product (30%)—that will give you a chance to practice using data scientific approaches to address a real-world planning problem. You may develop a final project independently or with one partner.

The project asks you to address a request from an imaginary planning agency for analysis of a potential equity gap within their city or region. You get to choose which region you are working for and the question you are asking about equitable outcomes. In designing your analysis and discussing the results, the agency would also like you to consider the theoretical strengths and weaknesses of using data scientific and smart cities approaches to examine equity and address inequities. Can you design an analysis that appropriately balances opportunities and concerns?

Project Forms

Projects may take two forms: (1) a functioning analysis with input data and a codebase that yield reproducible results, or (2) a proposal for an analysis that could reasonably be executed.

Functional Analysis

You will develop a well-documented repository of data and code, along with a short accompanying narrative describing the project's motivation, central question, approach, results, and discussion of their meaning. The narrative may be either a traditional paper or a customized webpage/site/app that combines text and graphics. Narratives for this option are expected to be 1,000–1,500 words (2–3 pages, single spaced).

Proposal

You will write a paper with sections similar to the narrative described above, but with considerably more detail about the proposed approach. In lieu of conducting the analysis, you must convincingly portray how it will be conducted (when it is funded, of course), including proposed data sources and tools. The proposal should also include expanded discussion of smart cities theory to support and critique how your approach relates to equity, both in the substantive question it addresses and opportunities or issues it raises methodologically. This will likely draw on literature outside of what is assigned for the course. Narratives for this option are expected to be 4,000–6,500 words (8–12 pages, single spaced).

Convincing proposals are crucial in both research and practice. Imagine you are applying for a grant to fund a research project or responding to an RFP issued by a public agency. In both cases, you need to convince the reader that your approach is actionable. It should also be intelligible for a non-expert audience. Writing with this combination of precision and clarity is a valuable skill to hone.

Project Components

Short Proposal (Due March 25th; 500–1,000 words; 10% of course grade)

Write a short proposal with the following sections outlining your proposed project:

  1. Introduction: What is the issue you are trying to address? Why is addressing it important? How will data science/smart cities tools help you address it?
  2. Brief Background (not a full literature review): Situate the problem in the context of other work, either in practice or research. Is the place you're working in already doing something related?
  3. Objectives: What specific question will your analysis answer (narrower than the issue described in the introduction)? Are you testing a hypothesis, or looking to develop theory in a space that lacks it?
  4. Methods: What data will you need? What methods will you use to analyze these data?
  5. Limitations and Ethics: How will your data and methods limit your conclusions? What ethical challenges may be posed by your analysis? What are the theoretical strengths and weaknesses of using your approach?
  6. References (not included in word count): Bibliographic list of any cited works, datasets, or other materials.

Please submit on ELMS as a PDF.

Presentation (In-Class May 6th; 8 minutes; 10% of course grade)

Please focus your presentation on these areas:

  1. Problem statement and central question
  2. Data and methods used to address it
  3. What do the conclusions mean? (not just the results, but their implications)
  4. What challenges and limitations are posed by using your approach? How did you address any of these challenges?

In addition to your in-class presentation, please submit slides on ELMS as a PDF.

Final Product (Due May 13th; see above for recommended lengths and formats; 30% of course grade)

Please submit on ELMS as a PDF with links to online content, as appropriate.

Participation (30% of grade)

This is a hands-on and discussion-oriented course with only 14 scheduled meeting times. More than one unexcused absence will affect your learning and participation grade. Please email me ahead of class, ideally by at least several hours, to let me know if you will be unable to join due to illness or other excusable reasons listed in the graduate course-related policies. Your participation grade for the semester will reflect both your leading of discussion in an assigned reading seminar (10%) and general engagement with class activities (20%).

Required Reading

We will read Jennifer Clark's book on smart cities, Uneven Innovation, from cover to cover. You may want to buy a hard copy, or you can read the ebook for free through the UMD library. All other readings are either available at links listed in the schedule or will be available on ELMS-Canvas.

Clark, J. (2020). Uneven Innovation: The Work of Smart Cities. Columbia University Press. (UMD Link)

Professional Communication

Please use this course as an opportunity to practice professional communication with me and your student colleagues. Follow professional etiquette in email correspondence. Grammarly has an excellent guide. Please call me “Chester”; note that other professors may prefer more formal titles. Introduce your preferred pronouns—mine are he/him/his—and refer to others by their preferred pronouns.

I typically read and send email during regular business hours: 9am to 5pm ET on weekdays. I aim to address time-sensitive email within one business day, but may take longer. If you send me an email on Friday, I may not get back to you until the next week. Please plan ahead and don’t expect an immediate response.

Zoom Guidelines: 'Cameras On' Norm

On Zoom, we will have a 'cameras on' norm to promote focus and collegiality. Many of us will have less-than-ideal environments or technological hiccups. This is okay. However, please do your best to minimize distractions for you and your classmates.

Code Readability

A key aspect of writing good code is readability: can you and others quickly and easily understand it? In many cases, code will run (readable to the machine) without being readable for a human. But sloppy, unreadable code won't be as useful for debugging, recycling, and documenting your process. I will do my best to demonstrate readable code in demos and exercise notebooks. You will need to write readable code to get full points on exercises. And you'll be much more appealing as a prospective collaborator or hire if you write readable code.

CS61A has an excellent composition guide. PEP 8 is a standard Python style guide. Google publishes their internal Python style guide. Any of these, or reasonable variants of them, will help your code look clean, professional, and readable. When in doubt, just try to be consistent and use good judgement.
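As a rough illustration of the difference, the two functions below do the same thing, but only the second is easy for a human to follow. The example (filtering census tracts by population density) and all names and data are made up for this sketch.

```python
# Harder to read: terse names, positional indexing, no documentation.
def f(a, b):
    return [x for x in a if x[1] / x[2] > b]


# Easier to read: descriptive names, a docstring, and a named intermediate step.
def filter_dense_tracts(tracts, min_density):
    """Return tracts whose population density exceeds min_density.

    Each tract is a (name, population, area_sq_km) tuple.
    """
    dense_tracts = []
    for name, population, area_sq_km in tracts:
        density = population / area_sq_km
        if density > min_density:
            dense_tracts.append((name, population, area_sq_km))
    return dense_tracts


# Hypothetical tracts: (name, population, area in square kilometers)
tracts = [("A", 5000, 2.0), ("B", 1200, 3.0), ("C", 9000, 1.5)]
dense = filter_dense_tracts(tracts, 1000)
print(dense)  # [('A', 5000, 2.0), ('C', 9000, 1.5)]
```

Both versions run and return the same list; the second is the one a collaborator could debug or reuse without asking what `x[1] / x[2]` means.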

Online Resources, Academic Integrity, and Troubleshooting

Coders sometimes feel like professional Googlers. There are a lot of code snippets on sites like StackOverflow. Working collaboratively is extremely useful for ideation and troubleshooting. This is an excellent debugging guide from CS61A. And generative AI tools, like ChatGPT, are increasingly important. One goal of this course is to introduce you to these sources for help and give you practice using rather than abusing them. This is both practical and ethical. You might write workable code simply by copying and pasting snippets from a forum or asking ChatGPT to write you a script. But this is unlikely to yield beautiful, reliable, and efficient code. Part of your job is to use resources intelligently, gut-check sources (is ChatGPT doing what I wanted?), customize examples to your purpose, and be a creative and ethical backstop to the availability of sloppy and poorly credited internet shortcuts.

Here are my suggestions:

  • Be a compulsive Googler. Ask ChatGPT what's wrong with your code. Ask your friends, too. Use these resources to learn and make your coding better, not to avoid learning. If I suspect you have merely copied and pasted code for an exercise, I will call you out on it. If I can't tell, good on the AI, but bad for your learning as a coder. Why bother taking this class?
  • If you copy and paste a major block of code, cite the source in a comment. I often paste StackOverflow URLs into docstrings for basic functions that I've borrowed or adapted from posts, both to show others where I got them, and remind myself in case I want to go back to the source later. This is both practical and ethical.
  • If you're in a Googling death spiral and just can't find an answer, first ask a classmate—there's a good chance you've confronted the same issue—then ask me.
  • You are encouraged to work on exercises with a classmate. You're even welcome to submit the same code for exercises. Just note at the top of your submission who you worked with.
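Here is a minimal sketch of what crediting a borrowed snippet and a collaborator might look like. The function is a generic example, and the URL and collaborator name are placeholders to be replaced with real ones.

```python
# Worked with: <classmate name>  (placeholder; note your collaborator here)


def flatten(nested_list):
    """Flatten one level of nesting in a list of lists.

    Adapted from a StackOverflow answer:
    https://stackoverflow.com/questions/<question-id>
    (placeholder; paste the actual URL of the post you borrowed from)
    """
    return [item for sublist in nested_list for item in sublist]


flat = flatten([[1, 2], [3], [4, 5]])
print(flat)  # [1, 2, 3, 4, 5]
```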

Grading

Final letter grades will be assigned based on these ranges: 90-100%: A, 80-89%: B, 70-79%: C, 60-69%: D, 0-59%: F. Letters may be augmented by + or – at the high or low end of each range.
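For a sense of how the component weights and letter ranges fit together, the sketch below combines the three component percentages (exercises 20%, final project 50%, participation 30%) and maps the total to a base letter. It is only an illustration: it assumes no rounding of the weighted total and ignores + and – modifiers, whose cutoffs are not specified here. The function names and sample scores are hypothetical.

```python
# Component weights from the syllabus headings
WEIGHTS = {"exercises": 0.20, "final_project": 0.50, "participation": 0.30}

# Lower bound of each letter range, checked from highest to lowest
LETTER_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (0, "F")]


def course_percentage(component_scores):
    """Return the weighted course percentage from component percentages (0-100)."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Return the base letter grade for a course percentage."""
    for cutoff, letter in LETTER_CUTOFFS:
        if percentage >= cutoff:
            return letter
    raise ValueError("Percentage must be non-negative.")


# Hypothetical component percentages
example_scores = {"exercises": 90, "final_project": 85, "participation": 95}
total = course_percentage(example_scores)
print(f"{total:.1f}% -> {letter_grade(total)}")
```

With these sample scores the weighted total is 18 + 42.5 + 28.5 = 89, which falls in the B range.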

Accessibility and Disability Services

The University of Maryland is committed to creating and maintaining a welcoming and inclusive educational, working, and living environment for people of all abilities. The University of Maryland is also committed to the principle that no qualified individual with a disability shall, on the basis of disability, be excluded from participation in or be denied the benefits of the services, programs, or activities of the University, or be subjected to discrimination. The Accessibility & Disability Service (ADS) provides reasonable accommodations to qualified individuals to provide equal access to services, programs and activities. ADS cannot assist retroactively, so it is generally best to request accommodations several weeks before the semester begins or as soon as a disability becomes known. Any student who needs accommodations should contact me as soon as possible so that I have sufficient time to make arrangements.

For assistance in obtaining an accommodation, contact Accessibility and Disability Service at 301-314-7682, or email them at [email protected]. Information about sharing your accommodations with instructors, note taking assistance and more is available from the Counseling Center.

Notice of Mandatory Reporting

Notice of mandatory reporting of sexual assault, sexual harassment, interpersonal violence, and stalking: As a faculty member, I am designated as a "Responsible University Employee," and I must report all disclosures of sexual assault, sexual harassment, interpersonal violence, and stalking to UMD's Title IX Coordinator per University Policy on Sexual Harassment and Other Sexual Misconduct.

If you wish to speak with someone confidentially, please contact one of UMD's confidential resources, such as CARE to Stop Violence (located on the Ground Floor of the Health Center) at 301-741-3442 or the Counseling Center (located at the Shoemaker Building) at 301-314-7651.

You may also seek assistance or supportive measures from UMD's Title IX Coordinator, Angela Nastase, by calling 301-405-1142 or emailing [email protected].

To view further information on the above, please visit the Office of Civil Rights and Sexual Misconduct's website at ocrsm.umd.edu.

Other University Policies

Please see UMD's website for graduate course-related policies.

Useful References

Other Courses

Structure and Interpretation of Computer Programs (CS61A), UC Berkeley. https://cs61a.org/

Introduction to Computer Science (CS50), Harvard. https://www.edx.org/cs50

Introduction to Data Science (CMSC320), UMD. https://cmsc320.github.io/

Courses listed below under "Acknowledgements"

Books

Adhikari, A., DeNero, J., Wagner, D. (2022) Computational and Inferential Thinking: The Foundations of Data Science, 2nd Edition. https://inferentialthinking.com (Originally developed as the textbook for Data 8: Foundations of Data Science, UC Berkeley)

Downey, A. B. (2012). Think Python: How to Think Like a Computer Scientist - 2e. Green Tea Press. https://greenteapress.com/wp/think-python-2e/

Lloyd, C. D. (2010). Spatial Data Analysis: An Introduction for GIS Users. Oxford University Press. (UMD Link)

Rey, S., Arribas-Bel, D., & Wolf, L. J. (2023). Geographic Data Science with Python. CRC Press. https://geographicdata.science/book/intro.html

Singleton, A. D., Spielman, S., & Folch, D. (2018). Urban Analytics. SAGE Publications Ltd.

Websites

Urban Informatics and Visualization Course Wiki (UC Berkeley)

Software Carpentry (Scientific computing tutorials)

Real Python (Python tutorials, monthly subscription with free trial)

DataCamp (Coding tutorials, monthly subscription with free trial)

LinkedIn Learning (Software tutorials, free for UMD students)

Acknowledgements

This course is inspired by numerous other courses and colleagues, especially:

Schedule

January 29th (Week 1)

  • Course introduction
    • Why data science?
    • Why urban data science?
    • Opportunities and challenges
    • Plan for the semester
  • Programming fundamentals
    • Colab notebooks
    • Intro to programming (with Python)
      • Why Python?
      • Variables
      • Syntax vs. style
      • Basic data types
      • Programming logic
        • Conditions
        • Loops
      • Errors and debugging
  • Pseudocode
  • GitHub
    • Intro
    • Submitting exercises

February 5th (Week 2)

  • More programming fundamentals
    • More intro to Python
      • Basic data types
      • Programming logic
        • Conditions
        • Loops
      • Functions
        • Namespaces
      • Classes and methods
      • Goodies
        • Conditional expressions
        • List comprehensions
        • Recursion
      • Packages
      • Errors and debugging
      • Building and troubleshooting
        • Documentation
        • Google
        • Generative AI
    • Pseudocode
  • Reading seminar
    • Somers, J. (2023). Begin End: A coder on the waning days of the craft. New Yorker, 99(38), 14-18. (Direct Link) (UMD Link)
    • [Chapter 1: "Uneven Innovation: The Evolution of the Urban Technology Project" (pp. 1–30)] Clark, J. (2020). Uneven Innovation… (UMD Link)

February 12th (Week 3)

  • Importing packages
  • Documentation
  • Classes
  • Table structure
    • Tidy data
  • Pandas
    • DataFrames
    • Calculations with columns
    • Series
    • Previewing
    • Selection and filtering
    • Grouping
    • Apply
    • Converting wide to long
  • Using a debugger
  • Numpy
  • CSVs
  • Parquet
  • Databases
  • Reading seminar
    • [Chapter 2: "Smart Cities as Solutions" (pp. 31–56)] [Chapter 3: "Smart Cities as Emerging Markets" (pp. 57–94)] Clark, J. (2020). Uneven Innovation… (UMD Link)

February 19th (Week 4)

  • Debugging and Working with Files
    • Using a debugger (and installing packages with pip)
    • Connecting to Google Drive in Colab
    • Loading data from a file
    • Tabular joining
    • Saving data to a file
    • Loading code from a file (module)
    • Repository structure
  • Introducing the final project
  • Reading seminar
    • Wilson, G., Aruliah, D. A., Brown, C. T., Hong, N. P. C., Davis, M., Guy, R. T., Haddock, S. H. D., Huff, K. D., Mitchell, I. M., Plumbley, M. D., Waugh, B., White, E. P., & Wilson, P. (2014). Best Practices for Scientific Computing. PLOS Biology, 12(1), e1001745. https://doi.org/10.1371/journal.pbio.1001745
    • [Chapter 4: "Smart Cities and the New Urban Entrepreneurship" (pp. 95–124)] Clark, J. (2020). Uneven Innovation… (UMD Link)

February 26th (Week 5)

  • Basic data visualization
    • PR demo: from notebook to merge
    • Principles of graphic communication
    • Matplotlib
    • Pandas plot method
    • Seaborn
    • Export to Illustrator
  • Reading seminar
    • [Chapter 1: "Graphical Excellence" (pp. 13–51)] Tufte, E. R. (2001). The Visual Display of Quantitative Information (2nd ed.). Graphics Press.
    • [Chapter 6: "Smart Cities as Participatory Planning" (pp. 156–180)] Clark, J. (2020). Uneven Innovation… (UMD Link)

March 4th (Week 6)

  • Accessing (and Wrangling) Data from the Web
    • APIs
    • Parsing JSON
    • Messy data
    • Big data
  • Reading seminar
    • Kitchin, R. (2014). The real-time city? Big data and smart urbanism. GeoJournal, 79(1), 1–14. https://doi.org/10.1007/s10708-013-9516-8
    • [Chapter 5: "Smart Cities as Urban Innovation Networks" (pp. 125–155)] Clark, J. (2020). Uneven Innovation… (UMD Link)

March 11th (Week 7)

  • Geospatial data
    • Most geoprocessing is just high school geometry
    • Points, linestrings, and polygons
    • Coordinate systems
    • Shapely
    • Geopandas
  • Reading seminar
    • Hanna, R., Kreindler, G., & Olken, B. A. (2017). Citywide effects of high-occupancy vehicle restrictions: Evidence from "three-in-one" in Jakarta. Science, 357(6346), 89–93. https://doi.org/10.1126/science.aan2747
    • Peng, Q., Knaap, G., & Finio, N. (2023). Do Multifamily Unit Rents Increase in Response to Light Rail in the Pre-service Period? International Regional Science Review, 01600176231162563. https://doi.org/10.1177/01600176231162563

March 18th (Spring Break)

NO CLASS

March 25th (Week 8)

NO CLASS

  • Short proposal for final project due
  • (Optional) Mid-week tutorial to set up Miniconda and JupyterLab on personal computers

April 1st (Week 9)

  • Geospatial data (cont.)
    • Overlap and proximity analyses
  • Reading seminar
    • [read "Introduction" (pp. 20–33) + skim the rest] Cheshire, J., & Uberti, O. (2014). London: The Information Capital. Particular Books.
    • [Chapter 7: "Smart Cities as the New Uneven Development" (pp. 181–200)] Clark, J. (2020). Uneven Innovation… (UMD Link)

April 8th (Week 10)

April 15th (Week 11)

  • Spatial visualization
    • Making good maps
    • Basemaps
    • Carto
  • Reading seminar
    • Pereira, R. H. M. (2019). Future accessibility impacts of transport policy scenarios: Equity and sensitivity to travel time thresholds for Bus Rapid Transit expansion in Rio de Janeiro. Journal of Transport Geography, 74, 321–332. https://doi.org/10.1016/j.jtrangeo.2018.12.005
    • Block, J. P., Scribner, R. A., & DeSalvo, K. B. (2004). Fast food, race/ethnicity, and income: A geographic analysis. American Journal of Preventive Medicine, 27(3), 211–217. https://doi.org/10.1016/j.amepre.2004.06.007

April 22nd (Week 12)

April 29th (Week 13)

  • Machine learning
    • Clustering
    • Sentiment analysis
  • Reading seminar
    • Kandt, J., & Batty, M. (2021). Smart cities, big data and urban policy: Towards urban analytics for the long run. Cities, 109, 102992. https://doi.org/10.1016/j.cities.2020.102992
    • [Epilogue: "The View from Inside the Urban Innovation Project" (pp. 217–230)] Clark, J. (2020). Uneven Innovation… (UMD Link)
  • Final project consultations

May 6th (Week 14)

  • Final project presentations

May 13th

NO CLASS

  • Final project due