2021-02-26

How to suppress a KeyError in Python when an API result is blank and the dataframe is empty while mapping multiple Foursquare results

I'm retrieving Foursquare venue data and plotting it on a Folium map. I'm plotting several API call results on the same map.

When the API returns an empty JSON result because no venues match the query within the search radius, the code throws a KeyError: it references dataframe columns that don't exist, because the API result is blank.
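The failure can be reproduced without calling the API. The sketch below assumes pandas 1.0 or later, where `pandas.json_normalize` given an empty list returns a frame with no columns; the column names are the ones shown in the traceback, and the `raised` flag is only there for illustration.

```python
import pandas as pd

# An empty 'venues' list, as returned when no venue matches the search.
venues = []
dataframe = pd.json_normalize(venues)

# The resulting frame has no rows and no columns at all.
print(dataframe.empty, list(dataframe.columns))

# Selecting columns that do not exist is what raises the KeyError.
wanted = ['name', 'categories', 'id']
try:
    dataframe.loc[:, wanted]
    raised = False
except KeyError:
    raised = True
print(raised)
```

So the error comes from the column selection, not from the request itself, which is why a check has to happen before that line.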

I want to continue to display the map with other results, and have the code ignore or suppress instances where the API result is blank.

I've tried try/except and if statements to test whether the dataframe is blank, but I cannot figure out how to "ignore the blanks and skip to the next API result".

Any advice would be appreciated.

## Foursquare Query 11 - name origin location
address = 'Convent Station, NJ' ## Try "Madison, NJ" for working location example
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

## name search parameters
search_query = 'Pharmacy'
radius = 1200

## define corresponding URL
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude,ACCESS_TOKEN, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()

## Convert to pandas dataframe
# assign relevant part of JSON to venues
venues = results['response']['venues']
venues
# transform venues into a dataframe
dataframe = json_normalize(venues)

## Filter results to only areas of interest
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    # the category list may sit under either key, depending on the endpoint
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

## Visualize the data
dataframe_filtered.name

# add the query 11 pharmacies as blue circle markers
for name, lat, lng, label in zip(dataframe_filtered.name, dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.name+" - "+dataframe_filtered.city+", "+dataframe_filtered.state):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill=True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

    ###
    ###
    ###

 
# display map
print('Location loaded, search parameters defined, url generated, results saved & converted to dataframe, map generated!')
venues_map

Error when API result is blank (there are no Foursquare results in the search radius)

/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:21: FutureWarning:

pandas.io.json.json_normalize is deprecated, use pandas.json_normalize instead

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-57-2deac5f680d4> in <module>()
     24 # keep only columns that include venue name, and anything that is associated with location
     25 filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
---> 26 dataframe_filtered = dataframe.loc[:, filtered_columns]
     27 
     28 # function that extracts the category of the venue

6 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1296             if missing == len(indexer):
   1297                 axis_name = self.obj._get_axis_name(axis)
-> 1298                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   1299 
   1300             # We (temporarily) allow for some missing keys with .loc, except in

KeyError: "None of [Index(['name', 'categories', 'id'], dtype='object')] are in the [columns]"
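One way to "ignore the blanks and skip to the next API result" might be to check the venue list before any dataframe code runs and `continue` when it is empty. The sketch below uses plain dictionaries in place of real Foursquare responses. `collect_markers`, `results_by_query`, and the sample venue are hypothetical names and data made up for illustration, and the returned list stands in for the `folium.CircleMarker` calls.

```python
def collect_markers(results_by_query):
    """Return (query, name, lat, lng) tuples, skipping blank API results."""
    markers = []
    for query, results in results_by_query.items():
        # .get() avoids a KeyError if 'response' or 'venues' is missing.
        venues = results.get('response', {}).get('venues', [])
        if not venues:
            # Blank result: nothing to plot for this query, move on.
            continue
        for venue in venues:
            location = venue.get('location', {})
            markers.append((query, venue.get('name'),
                            location.get('lat'), location.get('lng')))
    return markers


# Hypothetical stand-ins for two API responses: one blank, one with a venue.
sample = {
    'Pharmacy': {'response': {'venues': []}},
    'Cafe': {'response': {'venues': [
        {'name': 'Example Cafe', 'location': {'lat': 40.76, 'lng': -74.42}},
    ]}},
}
markers = collect_markers(sample)
print(markers)
```

In the notebook above, the same idea would mean wrapping everything after `venues = results['response']['venues']` in an `if venues:` block for each query, or testing `dataframe.empty` right after `json_normalize`, so that the filtering and plotting steps only run when there is something to plot.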


from Recent Questions - Stack Overflow https://ift.tt/3qZYH5m