Analyzing Computed Features

In addition to the raw electrophysiology and morphology data, the Allen Institute has also computed many electrophysiological features for the cells in their dataset. These features describe the intrinsic electrophysiological properties of each cell. Here, we will demonstrate how to access and analyze these features both across and within cells.

Setup

# Import all the necessary packages and initialize an instance of the cache
import pandas as pd
from allensdk.core.cell_types_cache import CellTypesCache
from allensdk.api.queries.cell_types_api import CellTypesApi
import matplotlib.pyplot as plt

ctc = CellTypesCache(manifest_file='cell_types/manifest.json')

print('Packages successfully imported.')
Packages successfully imported.

Below we’ll create pandas dataframes for the cell metadata and the electrophysiology features of all the mouse cells in this dataset. As in the previous notebook, we’ll set the row indices to the cell ID (the id column in the metadata, specimen_id in the features) and join the two dataframes. Unlike the previous notebook, here we’ll specify within get_cells() that we’d only like to use mouse cells. You can change the argument to species = [CellTypesApi.HUMAN] if you’d like to see human cells instead.

mouse_df = pd.DataFrame(ctc.get_cells(species = [CellTypesApi.MOUSE])).set_index('id')
ephys_df = pd.DataFrame(ctc.get_ephys_features()).set_index('specimen_id')
mouse_ephys_df = mouse_df.join(ephys_df)
mouse_ephys_df.head()

As you can see if you scroll to the right in the dataframe above, there are many pre-computed features available in this dataset. Here’s a glossary, in case you’re curious.

Image from the Allen Institute Cell Types Database Technical Whitepaper.
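To make the join step concrete, here is a minimal sketch on toy data. The IDs and values are made up for illustration; only the column names id, specimen_id, dendrite_type, and input_resistance_mohm come from this notebook. It shows that join() aligns rows on the index and keeps every row of the left dataframe, and that you can list the column names instead of scrolling.

```python
import pandas as pd

# Toy stand-ins for the metadata and feature tables (made-up values).
cells = pd.DataFrame(
    {"id": [101, 102, 103], "dendrite_type": ["spiny", "aspiny", "spiny"]}
).set_index("id")
features = pd.DataFrame(
    {"specimen_id": [101, 103, 104], "input_resistance_mohm": [150.0, 210.0, 95.0]}
).set_index("specimen_id")

# DataFrame.join aligns rows on the index and defaults to a left join:
# every row of `cells` is kept, and a cell without features gets NaN.
joined = cells.join(features)
print(joined)

# List the available columns rather than scrolling to the right.
print(joined.columns.tolist())
```

Because the join is a left join, cell 102 is kept with a missing value and specimen 104, which has no metadata row, is dropped. This is one reason the plotting code later calls dropna().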

Compare Features Across Cells

The Allen Institute provides many precomputed features that you might consider comparing across cells, including input resistance (input_resistance_mohm), adaptation ratio (adaptation), average interspike interval (avg_isi), and many others. We’ve compiled a complete glossary for you.

To compare cell types, we can subset our electrophysiology dataframe for a specific transgenic line, structure layer, brain area, and more. Below, we’ll create two dataframes to compare cells with spiny dendrites to those with aspiny dendrites. While most excitatory cells are spiny, most inhibitory cells are aspiny.

# Define your cell type variables below
cell_type1 = 'spiny'
cell_type2 = 'aspiny'

# Create our dataframes from our cell types
mouse_spiny_df = mouse_ephys_df[mouse_ephys_df['dendrite_type'] == cell_type1]
mouse_aspiny_df = mouse_ephys_df[mouse_ephys_df['dendrite_type'] == cell_type2]
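The same boolean-mask pattern extends to other metadata columns. The sketch below uses a toy dataframe; the column name structure_layer_name and the values shown are assumptions for illustration, so check mouse_ephys_df.columns for the names in your copy of the data. It counts the cells in each category and then combines two conditions with &.

```python
import pandas as pd

# Toy metadata table (made-up rows; column names are assumptions).
toy_df = pd.DataFrame(
    {
        "dendrite_type": ["spiny", "aspiny", "spiny", "sparsely spiny", "aspiny"],
        "structure_layer_name": ["4", "2/3", "5", "4", "4"],
    },
    index=[1, 2, 3, 4, 5],
)

# Check which categories exist, and how many cells fall in each,
# before choosing values to subset on.
type_counts = toy_df["dendrite_type"].value_counts()
print(type_counts)

# Combine conditions with & (and) or | (or); each condition is a boolean Series.
is_aspiny = toy_df["dendrite_type"] == "aspiny"
in_layer4 = toy_df["structure_layer_name"] == "4"
aspiny_layer4_df = toy_df[is_aspiny & in_layer4]
print(aspiny_layer4_df)
```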

Now that we have two cell types to compare, we can use the precomputed features to plot some of our cells’ characteristics. Let’s start by using a boxplot to compare the input resistance between our two cell types.

# Select the precomputed feature that we would like to compare
feature = 'input_resistance_mohm'

# Get the pandas series for our feature from each dataframe. Drop any NaN values.
clean_spiny = mouse_spiny_df[feature].dropna()
clean_aspiny = mouse_aspiny_df[feature].dropna()

# Plot our figure and provide labels
plt.boxplot([clean_spiny, clean_aspiny])
plt.ylabel('Input Resistance (MOhm)')
plt.xticks([1,2], [cell_type1, cell_type2])
plt.title(feature + ' in ' + cell_type1 + ' and ' + cell_type2 + ' cells')

# Show our plot 
plt.show()
[Figure: boxplot of input resistance (MOhm) for spiny and aspiny cells]
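A boxplot is easier to interpret next to the numbers behind it. As a minimal sketch, again on made-up values rather than the real dataset, the code below drops missing values and summarizes one feature per dendrite type with groupby().

```python
import pandas as pd

# Toy feature table (made-up values, including one missing measurement).
toy_df = pd.DataFrame(
    {
        "dendrite_type": ["spiny", "spiny", "spiny", "aspiny", "aspiny", "aspiny"],
        "input_resistance_mohm": [120.0, 150.0, None, 200.0, 260.0, 230.0],
    }
)

# Drop rows where the feature is missing, then summarize per cell type.
summary = (
    toy_df.dropna(subset=["input_resistance_mohm"])
    .groupby("dendrite_type")["input_resistance_mohm"]
    .agg(["count", "median", "min", "max"])
)
print(summary)
```

With the real data, you could apply the same chain to mouse_ephys_df and change the column name to summarize any other feature.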

Compare Features Within Cells

The power in this dataset is not only the ability to compare two cell types, but to look across all of the data for trends that emerge. Even if we dig into the weeds of the action potential shape, we can make some interesting observations.

Let’s look at the depth of the fast trough, and the ratio between the upstroke and downstroke of the action potential:

  • Action potential fast trough (fast_trough_v_long_square): Minimum value of the membrane potential in the interval lasting 5 ms after the peak.

  • Upstroke/downstroke ratio (upstroke_downstroke_ratio_long_square): The ratio between the absolute values of the action potential peak upstroke and the action potential peak downstroke.
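To see what these two definitions measure, here is a sketch that computes both on a synthetic, piecewise-linear "action potential". The trace and all of its numbers are invented for illustration. Taking the upstroke and downstroke as the maximum and minimum of dV/dt over the whole trace is a simplifying assumption, not necessarily how the Allen Institute's pipeline computes them.

```python
import numpy as np

# Synthetic trace: rest at -70 mV, rise to +30 mV in 0.5 ms,
# fall to -80 mV in 1 ms, then recover slowly (all values made up).
t = np.linspace(0.0, 20.0, 201)  # time in ms, 0.1 ms steps
v = np.interp(
    t,
    [0.0, 5.0, 5.5, 6.5, 11.5, 20.0],
    [-70.0, -70.0, 30.0, -80.0, -70.0, -70.0],
)
dt = t[1] - t[0]

# Fast trough: minimum membrane potential in the 5 ms after the peak.
peak_idx = int(np.argmax(v))
window = (t > t[peak_idx]) & (t <= t[peak_idx] + 5.0)
fast_trough_v = v[window].min()

# Upstroke and downstroke: steepest rise and steepest fall (mV/ms).
dvdt = np.gradient(v, dt)
upstroke = dvdt.max()
downstroke = dvdt.min()
ratio = abs(upstroke) / abs(downstroke)

print(f"fast trough: {fast_trough_v:.1f} mV")
print(f"upstroke/downstroke ratio: {ratio:.2f}")
```

In this toy spike the rise is faster than the fall, so the ratio is greater than 1. A spike whose rise and fall are equally steep would have a ratio close to 1.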

The cell below reuses the spiny and aspiny dataframes we created above. It creates a scatterplot comparing the depth of the trough with the upstroke:downstroke ratio, where each dot is colored by dendrite type.

# Create our plot! Calling scatter twice like this will draw both groups on the same axes.
plt.scatter(mouse_spiny_df['fast_trough_v_long_square'],
            mouse_spiny_df['upstroke_downstroke_ratio_long_square'],
            label='Spiny')
plt.scatter(mouse_aspiny_df['fast_trough_v_long_square'],
            mouse_aspiny_df['upstroke_downstroke_ratio_long_square'],
            label='Aspiny')

plt.ylabel('upstroke-downstroke ratio')
plt.xlabel('fast trough depth (mV)')
plt.legend()  # uses the label given to each scatter call

plt.show()
[Figure: scatterplot of upstroke-downstroke ratio against fast trough depth (mV), colored by dendrite type (spiny, aspiny)]

This is the true power of neural data science! It looks like the two clusters in the data partially relate to dendrite type. Cells with spiny dendrites (which are typically excitatory cells) have a high upstroke:downstroke ratio and a shallower (less negative) trough. Cells with aspiny dendrites (typically inhibitory cells) are a little more varied, but only aspiny cells have a low upstroke:downstroke ratio and a deeper (more negative) trough.
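One way to check an observation like this beyond eyeballing the plot is to count how many cells of each type fall in the region of interest. The sketch below uses made-up values, and the two cutoffs are hypothetical; with real data you would choose them after looking at the scatterplot.

```python
import pandas as pd

# Toy data (made-up values) using the two feature columns plotted above.
toy_df = pd.DataFrame(
    {
        "dendrite_type": ["spiny", "spiny", "spiny", "aspiny", "aspiny", "aspiny"],
        "fast_trough_v_long_square": [-45.0, -48.0, -50.0, -60.0, -47.0, -62.0],
        "upstroke_downstroke_ratio_long_square": [3.5, 4.0, 3.0, 1.2, 3.2, 1.5],
    }
)

# Hypothetical cutoffs for "low ratio" and "deep trough".
ratio_cutoff = 2.0
trough_cutoff = -55.0  # mV

in_cluster = (
    (toy_df["upstroke_downstroke_ratio_long_square"] < ratio_cutoff)
    & (toy_df["fast_trough_v_long_square"] < trough_cutoff)
)

# Summing a boolean Series counts its True values, here per dendrite type.
cluster_counts = in_cluster.groupby(toy_df["dendrite_type"]).sum()
print(cluster_counts)
```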