Fangcao Xu

Graduate Student, Pennsylvania State University Curriculum Vitae

About Me

I am currently a graduate student in the Geography Department at Pennsylvania State University. I joined the GeoVISTA Center under the supervision of Prof. Donna Peuquet in Fall 2016. My research topic is "Spatiotemporal Information Diffusion and Network Inference via Sequential Pattern Mining". I am passionate about data-driven research and big data analytics in Geographic Information Science, using data mining, machine learning, and geo-visual analytics.

In Fall 2014, I was admitted to the University of Pennsylvania for the M.Sc. in Urban Spatial Analytics in the School of Design. My research foci there included data science, spatial data analytics (most notably Geographic Information Systems, GIS), statistics, R and Python programming, data visualisation, and web-based mapping, with substantive knowledge of the urban context (e.g., urban land use planning, transportation, and natural hazard simulation and prediction).

I received my B.Sc. in Surveying and Mapping in June 2014 from Wuhan University, a top-ten university in China that ranks No. 1 in geodesy. My bachelor's thesis topic was "Reconstruction of 3D City Models Based on Oblique Images".

Working Experience


My work at Esri China (Beijing) from 2015 to 2016:
Data Scientist, responsible for 1) designing business geographic web map prototypes and user interfaces, 2) collecting and analyzing geographic big data, and 3) developing algorithms for trading area analysis, site selection, and route optimization.


My Research


My research at Pennsylvania State University in the GeoVISTA Center:
Cyber Bullying on Twitter: A Quasi-Experimental Study of Corporate Policy Effects: To investigate cyber bullying in social media posts and the effects of corporate policy, more than 5 TB of Twitter data have been collected to study two social events relevant to the Women's March. My work includes PostgreSQL database management, data querying, statistical analysis, and interactive web mapping.

Comment Analytics: Leveraging Big Unstructured Data to Understand Spatial and Temporal Variations in Public Response to Government Policy: Alan M. MacEachren (PI), Jennifer Baka (CoPI), Prasenjit Mitra (CoPI). Internal (Penn State) Institute for CyberScience Seed Grant.

Natural Language Processing and Network Analysis of GOP Press Releases: Extract named entities (social actors, locations, time stamps) from Republican press releases via web scraping (Python), named entity recognition, and geocoding (Java). Conduct network analysis of a co-occurrence matrix, where named entities serve as nodes and their co-occurrences in the same release serve as edges. The walktrap algorithm is used to identify communities in the network.
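
The co-occurrence step described above can be illustrated with a minimal Python sketch. It is not the project's actual code: the function name `cooccurrence_edges` and the sample entities are hypothetical placeholders, and the walktrap community detection is left out because it would rely on a separate graph library.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(releases):
    """Count how often each pair of entities appears in the same release.

    `releases` is an iterable of entity collections, one per press release.
    Returns a Counter mapping a sorted (entity_a, entity_b) pair to its weight.
    """
    edges = Counter()
    for entities in releases:
        # Deduplicate within a release so each pair counts once per release.
        unique = sorted(set(entities))
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical placeholder entities, not data from the project.
sample_releases = [
    ["Actor A", "Washington", "Actor B"],
    ["Actor A", "Actor B"],
    ["Actor B", "Texas"],
]

edges = cooccurrence_edges(sample_releases)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

The resulting weighted edge list could then be loaded into a graph library that offers walktrap community detection, which is where the community structure would be computed.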

Replication of the Conflict Network Analysis of the Balkans from 1989 to 1999: Determine the bipolar sub-groups of the Balkan conflict network, in which nodes represent political actors and edges represent conflict events between actors. The actors are orthogonally projected into a 2D conflict space by maximizing the edge distances between conflict groups, based on the minimal and maximal eigenvalues. A video created in R shows the dynamics of the conflict space by year.


My research at the University of Pennsylvania, in MUSA:
Spatiotemporal analysis of bike travel in Chicago: Using trip data from Chicago's Divvy bike-share system, the project develops a Python ArcGIS tool to find hotspots of departure and return stations for two user groups, customers and subscribers, during different time periods. It provides a geo-visualization map with route guidance and confirms the correlations between user group and bike use based on spatial analysis of travel and residential patterns.

Retail Site and Trading Area Analysis, Philadelphia: Find the most appropriate locations for a new retail site by calculating, for each grid cell, the probability that consumers would visit it. Demographic and local business data, zoning information, streets, and transportation data are collected for network analysis and spatial analysis in ArcGIS.

Housing Price Prediction with a Regression Model, Philadelphia: Train a regression model on observed data points to predict housing sale prices. The final model includes 44 socio-economic and spatial variables, either collected directly from open data sources or derived through spatial analysis in ArcGIS. The final model's predictions for 30 test sites help reveal the key factors influencing housing sale prices.

Web Mapping Application for Entertainment Site Selection, New York: A web-mapping application, written in the MXML programming language, calculates the prospective population growth within a user-specified radius of selected places. It calls vector data uploaded to ArcServer and can support city planning by identifying new open-space sites with the highest likelihood of population growth.

Google Web Mapping Application for Restaurant Navigation, Philadelphia: Develop a geospatial web application, built with HTML and PHP, that maps all restaurants in Philadelphia and gives the best route from the user's location to a chosen restaurant. It provides interactive functions, including filtering by restaurant type, opening hours, and search radius from the user's location.

Urban and Environmental Land Cover Change Analysis, Pennsylvania: Develop a series of urban and landscape change indices between 1990 and 2010 for Pennsylvania. Choose appropriate Fragstats metrics to summarize the degree to which urbanized land, farmland, forest, and pasture were more or less fragmented in 2011, and predict the areas suitable for development.

Multi-hazard Damage Assessment, Pennsylvania: Using Hazus and ArcGIS, assess the impacts of three prospective hazards (flooding, earthquake, and hurricane) on Bucks County. The potentially affected areas and damages are calculated for different blocks based on geographic data downloaded from FEMA.

Three Approaches to Modeling Urban Growth, Atlanta: Build rule-based, cellular automata-type, and statistical models in SPSS to allocate projected population and job growth by 2030 by analyzing the opportunities and constraints of different land uses. The project is designed to find the most suitable locations for urban development in Atlanta based on the probabilities predicted by the three models.


My research at Wuhan University in Surveying and Mapping:
Reconstruction of 3D city models for the Pingding Mountain and Jinyang County with a new type of airborne camera combination. Image-based 3D modeling and improved texture rendering are achieved by automatically orienting and joining digital oblique and vertical aerial images.



  1. MacEachren, A. M., Caneba, R., Chen, H., Cole, H., Domanico, E., Triozzi, N., Xu, F., & Yang, L. (2018). Is This Statement About A Place? Comparing two perspectives. In Proceedings of the International Conference on GIScience (Short Paper).
  2. Chen, X., Xu, F., & Wang, W. (2018). Geographic big data’s application in Retailing business. In: Big Data Support of Urban Planning and Management, pp. 157–176. Springer, Cham.
  3. Xu, F., Deng, F., Wan, F., Kang, J., & Zhao, Z. (2013). The realization of 3D buildings reconstruction from oblique images. Science of Surveying and Mapping.

  1. American Association of Geographers (AAG) Annual Conference, High Performance Computing for Big Spatiotemporal Data Mining. Apr. 2018, New Orleans, USA

Computer Skills

Software and Programming Languages related to My Research:


ArcGIS Desktop; ArcGIS Engine; ArcGIS Online; Google Earth GUI; GeoDA; Hazus; WinTR-55; SWMM

Remote Sensing/Imagery

ERDAS; ENVI; AutoCAD; SketchUp

Programming Languages

Java; JavaScript; Python; R; Scala; Shell Scripting; SQL; PHP; HTML

High Performance Computing/Database

Apache Spark; Apache Hadoop; MySQL; PostgreSQL; ESRI Geodatabase

Machine Learning

TensorFlow; Scikit-Learn; Gensim; Amazon Mechanical Turk


R; SPSS; Matlab; Excel; Fragstats


Google Map API; GeoNames; GeoPy; GeoTxt


Linux; Interactive Web Mapping; Stanford NLP; SNAP


Find me on academic and social media

Contact Me

302 Walker Building, Pennsylvania State University, State College, PA, United States

Phone: see my CV

My Lab