Calculating user contributions¶

When gearing up for publication, you will find that, due to the collaborative approach of CATMAID, a great many people will have contributed to the data you would like to publish. To come up with a sensible author list, it is useful to quantify how much each user has contributed to the reconstruction of a set of neurons. CATMAID provides a broad summary of contributions, e.g. via the selection table. Pymaid lets you fetch that data and more.

First, let’s get a set of olfactory projection neurons to demonstrate user statistics on:

[1]:

import pymaid

# Initialize connection
rm = pymaid.connect_catmaid()

# Fetch some neurons
nl = pymaid.get_neurons('annotation:glomerulus DA1 right')

INFO  : Found 9 skeletons with matching annotation(s) (pymaid)

First some basic stats:

[2]:

cont = pymaid.get_contributor_statistics(nl, separate=True)
cont.head()

[2]:

	skeleton_id	n_nodes	node_contributors	n_presynapses	pre_contributors	n_postsynapses	post_contributors	review_contributors	multiuser_review_minutes	construction_minutes	min_review_minutes
0	61221	7875	{'adamjohn': 2442, 'ranftp': 4109, 'ratliffj':...	404	{'hsuj': 1, 'adamjohn': 10, 'ranftp': 249, 'ha...	128	{'ranftp': 89, 'alij': 1, 'koppenhaverb': 5, '...	{'ratliffj': 2834, 'masoodpanahn': 84, 'koppen...	165	416	135
1	27295	9975	{'koppenhaverb': 102, 'robertsr': 53, 'lovef':...	412	{'robertsr': 32, 'alij': 6, 'schlegelp': 205, ...	59	{'alij': 2, 'heatha': 2, 'hallouc': 2, 'schleg...	{'schlegelp': 517, 'adesinaa': 2586, 'kmecoval...	367	502	313
2	57323	4585	{'kmecoval': 2194, 'robertsr': 159, 'koppenhav...	361	{'calles': 2, 'michaelLingelbach': 1, 'kmecova...	76	{'vallas': 1, 'ranftp': 1, 'edmondsona': 20, '...	{'ratliffj': 2116, 'kmecoval': 350, 'masoodpan...	176	208	84
3	57311	4882	{'kmecoval': 1376, 'robertsr': 24, 'lovef': 75...	371	{'robertsr': 18, 'lovef': 220, 'jamasba': 4, '...	58	{'robertsr': 10, 'ranftp': 1, 'heatha': 1, 'lo...	{'mooree': 2308, 'adamjohn': 1231, 'kmecoval':...	390	241	163
4	57353	4898	{'kmecoval': 1054, 'hsuj': 1, 'koppenhaverb': ...	302	{'hsuj': 1, 'robertsr': 13, 'kmecoval': 17, 's...	24	{'robertsr': 7, 'kmecoval': 2, 'meechank': 13,...	{'adamjohn': 1843, 'sharifin': 1231, 'masoodpa...	177	247	101
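
The `node_contributors` column holds one dictionary per neuron, mapping user logins to node counts. Below is a minimal sketch of summing these dictionaries across neurons. It does not query CATMAID: the `cont_demo` DataFrame is a hypothetical stand-in that only mimics the column layout shown above, and all logins and counts in it are invented.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the DataFrame returned with separate=True;
# the logins and counts are invented for illustration.
cont_demo = pd.DataFrame({
    'skeleton_id': [1, 2],
    'node_contributors': [{'alice': 120, 'bob': 30}, {'bob': 50, 'carol': 10}],
})

# Sum the per-neuron dictionaries into one per-user tally
totals = Counter()
for contributors in cont_demo['node_contributors']:
    totals.update(contributors)

node_totals = pd.Series(totals, name='nodes').sort_values(ascending=False)
print(node_totals)
```

The same loop should work for the `pre_contributors`, `post_contributors` and `review_contributors` columns, assuming they follow the same dictionary layout.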

For convenience, there is another function that shows contributions per user instead of per neuron:

[3]:

by_user = pymaid.get_user_contributions(nl)
by_user.head()

[3]:

	user	nodes	presynapses	postsynapses	nodes_reviewed
0	robertsr	14527	1038	481	4729
1	ranftp	5927	401	132	0
2	kmecoval	5204	87	4	5177
3	hallouc	4267	61	11	8047
4	michaelLingelbach	4253	1	0	0

Sometimes the plain number of nodes (or cable length) is not very helpful. Did a person trace mostly backbone (fast), or did they work through fine terminal branches (slow)? Personally, I consider time invested to be a better metric:

[4]:

time_inv = pymaid.get_time_invested(nl)
time_inv.head()

[4]:

	total	creation	edition	review
user
robertsr	1113	690	420	198
hallouc	627	183	180	333
koppenhaverb	378	108	15	144
ranftp	357	219	189	0
kmecoval	333	147	72	126

Notice how the order of contributors changes depending on whether we look at, e.g., nodes created or total time invested.
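
One way to make that reordering explicit is to rank users under both metrics. The sketch below re-types the values for four users from the example outputs above rather than querying CATMAID; the `demo` and `ranks` DataFrames and their column names are illustrative only.

```python
import pandas as pd

# Values copied from the example outputs above (nodes created, total minutes)
demo = pd.DataFrame(
    {'nodes': [14527, 5927, 5204, 4267],
     'total_minutes': [1113, 357, 333, 627]},
    index=['robertsr', 'ranftp', 'kmecoval', 'hallouc'],
)

# Rank 1 = largest contribution under each metric
ranks = demo.rank(ascending=False).astype(int)
ranks.columns = ['rank_by_nodes', 'rank_by_time']
print(ranks)
```

Among these four users, `hallouc` is last by nodes created but second by time invested.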

pymaid.get_time_invested() takes timestamps from node creation, edits and reviews and calculates the time invested in minutes. In brief, pymaid sums up the minutes in which a user has performed 10+ actions. You can tweak this behaviour - have a look at the documentation!
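
To make the counting rule concrete, here is a simplified sketch on made-up timestamps. It is not pymaid's actual implementation: it only bins actions into one-minute windows and counts the windows that contain at least `min_actions` actions. The function name `active_minutes` and its parameter are hypothetical.

```python
import pandas as pd


def active_minutes(timestamps, min_actions=10):
    """Count one-minute bins containing at least `min_actions` timestamps."""
    ts = pd.Series(pd.to_datetime(timestamps))
    per_minute = ts.dt.floor('min').value_counts()
    return int((per_minute >= min_actions).sum())


# Made-up actions: 12 within the first minute, 3 within the second
start = pd.Timestamp('2017-01-01 10:00:00')
busy = [start + pd.Timedelta(seconds=4 * i) for i in range(12)]
idle = [start + pd.Timedelta(seconds=60 + 20 * i) for i in range(3)]

print(active_minutes(busy + idle))
```

With the default threshold only the first minute counts as active; lowering `min_actions` to 3 would count both.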

Compartmentalizing contributions¶

You might end up publishing only reconstructions in a certain neuropil. In that case, you will want to subset contributions accordingly.

Let’s first prune the projection neurons to the lateral horn:

[5]:

import matplotlib.pyplot as plt

lh = pymaid.get_volume('LH_R')
lh.color = (240,240,240,.5)

nl_lh = nl.prune_by_volume(lh, inplace=False)

fig, ax = pymaid.plot2d([nl_lh, lh], connectors=False, linewidth=1.5)
plt.show()

../_images/source_user_contributions_9_0.png

Do not use pymaid.get_user_contributions() or pymaid.get_contributor_statistics() with neuron fragments: they are not “fragment-safe”. These functions send only skeleton IDs to the CATMAID server, which knows nothing about local changes to these neurons, so they report statistics for the whole neurons.

For our purposes, pymaid.get_time_invested() and pymaid.get_team_contributions() are fragment-safe:

[6]:

time_lh = pymaid.get_time_invested(nl_lh)
time_lh.head()

[6]:

	total	creation	edition	review
user
robertsr	588	249	210	177
koppenhaverb	225	96	9	33
lovef	198	0	15	162
schlegelp	171	27	30	66
ranftp	105	69	39	0
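
To see how much of each person's effort falls inside the lateral horn, the two results can be put side by side. The sketch below re-types the totals for the three users that appear in both example outputs above; in a live session you would combine `time_inv` and `time_lh` directly. The names `whole`, `in_lh` and `lh_fraction` are illustrative.

```python
import pandas as pd

# Total minutes for whole neurons vs. lateral horn fragments (from the tables above)
whole = pd.Series({'robertsr': 1113, 'koppenhaverb': 378, 'ranftp': 357}, name='total_all')
in_lh = pd.Series({'robertsr': 588, 'koppenhaverb': 225, 'ranftp': 105}, name='total_lh')

# Align both series on the user index and compute the share spent in the lateral horn
compare = pd.concat([whole, in_lh], axis=1)
compare['lh_fraction'] = (compare.total_lh / compare.total_all).round(2)
print(compare)
```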

There is currently no pymaid wrapper for the number of nodes/connectors/etc. per user, but these counts are easy to compute from the node details:

[7]:

# Fetch creator, editor and reviewer details for all nodes and connectors
node_details = pymaid.get_node_details(nl_lh.nodes.treenode_id.values)
connector_details = pymaid.get_node_details(nl_lh.connectors.connector_id.values)
node_details.head()

[7]:

	node_id	creation_time	creator	edition_time	editor	reviewers	review_times
0	331812	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[53, 13]	[2016-05-03 14:43:00, 2015-05-12 12:32:00]
1	331813	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[13, 53]	[2015-05-12 12:32:00, 2016-05-03 14:43:00]
2	331814	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[13, 53]	[2015-05-12 12:32:00, 2016-05-03 14:43:00]
3	331815	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[13, 53]	[2015-05-12 12:32:00, 2016-05-03 14:43:00]
4	331816	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[53, 13]	[2016-05-03 14:43:00, 2015-05-12 12:32:00]

Next, map user IDs to names:

[8]:

user_list = pymaid.get_user_list().set_index('id').login.to_dict()

node_details['creator2'] = node_details.creator.map(user_list)
connector_details['creator2'] = connector_details.creator.map(user_list)

node_details.head()

[8]:

	node_id	creation_time	creator	edition_time	editor	reviewers	review_times	creator2
0	331812	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[53, 13]	[2016-05-03 14:43:00, 2015-05-12 12:32:00]	adesinaa
1	331813	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[13, 53]	[2015-05-12 12:32:00, 2016-05-03 14:43:00]	adesinaa
2	331814	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[13, 53]	[2015-05-12 12:32:00, 2016-05-03 14:43:00]	adesinaa
3	331815	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[13, 53]	[2015-05-12 12:32:00, 2016-05-03 14:43:00]	adesinaa
4	331816	2015-05-12 12:31:00	13	2015-10-20 06:44:00	13	[53, 13]	[2016-05-03 14:43:00, 2015-05-12 12:32:00]	adesinaa

Group by user and count:

[9]:

import pandas as pd

node_counts = node_details.groupby('creator2').node_id.count()
cn_counts = connector_details.groupby('creator2').node_id.count()

lh_counts = pd.concat([node_counts, cn_counts], axis=1, sort=True).fillna(0).astype(int)

lh_counts.columns=['nodes', 'connectors']

lh_counts.sort_values('nodes', ascending=False).head()

[9]:

	nodes	connectors
robertsr	5527	1091
koppenhaverb	2847	0
ranftp	2496	349
jefferis	947	11
schlegelp	914	333
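
Review counts can be derived in a similar way, because the `reviewers` column holds a list of user IDs per node. Below is a sketch on a hand-made table that only mimics the layout of `node_details`. The node IDs, user IDs and the ID-to-login mapping are invented, and `DataFrame.explode` requires pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical stand-ins for node_details and the user ID -> login mapping
details_demo = pd.DataFrame({
    'node_id': [1, 2, 3],
    'creator': [13, 13, 53],
    'reviewers': [[53, 13], [53], []],
})
users_demo = {13: 'alice', 53: 'bob'}

# One row per (node, reviewer) pair; unreviewed nodes become NaN and are dropped
reviews = details_demo.explode('reviewers').dropna(subset=['reviewers'])
review_counts = reviews.reviewers.map(users_demo).value_counts()
print(review_counts)
```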

Contributions by date¶

You might find yourself in a situation where your neurons of interest have already been published. In that case, you could consider crediting only reconstruction work that has been done since.

For this, we need to subset neurons to nodes/connectors that have been created/reviewed/edited after a certain date:

[10]:

import numpy as np

after_date = np.datetime64('2017-01-01')

# Get node details
node_details = pymaid.get_node_details(nl.nodes.treenode_id.values)
cn_details = pymaid.get_node_details(nl.connectors.connector_id.values)

# Subset to nodes/connectors created on or after the given date
new_nodes = node_details[node_details.creation_time >= after_date].node_id.values
new_connectors = cn_details[cn_details.creation_time >= after_date].node_id.values

# Subset neurons to the new nodes/connectors
nl_new = nl.copy()
for n in nl_new:
    pymaid.subset_neuron(n, new_nodes, inplace=True, remove_disconnected=False)
    n.connectors = n.connectors[n.connectors.connector_id.isin(new_connectors)]
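
The cell above filters on creation time only. If edits and reviews should count as well, the mask can be widened. Below is a sketch on a hand-made table that mimics `node_details` (all values are invented); the helper `touched_since` is hypothetical and not part of pymaid.

```python
import numpy as np
import pandas as pd


def touched_since(details, date):
    """Boolean mask: node created, edited or reviewed on or after `date`."""
    date = pd.Timestamp(date)
    reviewed = details.review_times.apply(
        lambda times: any(pd.Timestamp(t) >= date for t in times)
    )
    return (details.creation_time >= date) | (details.edition_time >= date) | reviewed


# Hypothetical stand-in for node_details
details_demo = pd.DataFrame({
    'node_id': [1, 2, 3],
    'creation_time': pd.to_datetime(['2015-05-12', '2015-05-12', '2017-03-01']),
    'edition_time': pd.to_datetime(['2015-10-20', '2017-02-01', '2017-03-01']),
    'review_times': [[pd.Timestamp('2016-05-03')], [], [pd.Timestamp('2017-06-01')]],
})

mask = touched_since(details_demo, np.datetime64('2017-01-01'))
recent_ids = details_demo.loc[mask, 'node_id'].tolist()
print(recent_ids)
```

Node 1 is excluded because all of its timestamps predate the cutoff; node 2 is kept because of its later edit.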

Let’s visualize this: old neuron in grey, recent additions in red.

[11]:

fig, ax = pymaid.plot2d(nl, color=(.8,.8,.8), connectors=False)
_ = pymaid.plot2d(nl_new, color='r', ax=ax, connectors=False, linewidth=1)

plt.show()

../_images/source_user_contributions_21_0.png

Now that we have subset our neurons of interest, we can use pymaid.get_time_invested() and pymaid.get_team_contributions() just like we did before:

[12]:

time_new = pymaid.get_time_invested(nl_new)
time_new.head()

[12]:

	total	creation	edition	review
user
robertsr	168	81	51	9
schlegelp	30	9	9	9
alij	6	0	0	0
sharifin	6	0	3	0
edmondsona	6	3	3	0

Relevant functions:

`pymaid.get_contributor_statistics`(x[, …])	Retrieve contributor statistics for given skeleton ids.
`pymaid.get_user_contributions`(x[, teams, …])	Return number of nodes and synapses contributed by each user.
`pymaid.get_time_invested`(x[, mode, by, …])	Calculate the time spent working on a set of neurons.
`pymaid.get_team_contributions`(teams[, …])	Get contributions by teams (nodes, reviews, connectors, time invested).