Stat tests #5

Open · wants to merge 6 commits into master
122 changes: 122 additions & 0 deletions stat_tests/stat_test.py
@@ -0,0 +1,122 @@
import pandas as pd
import argparse
import seaborn as sns
import stat_test_utils as stu
import os

defaultbaseline = "../test_data/V2_500Agents_125StepsSeed1.csv"
defaultpovertythreshold = 1.0

def parse_args():
    """
    The tests can be run using
    python stat_test.py --simulation <path> [--baseline <path>] [--povertythreshold <value>]
    """
    parser = argparse.ArgumentParser()

    parser.add_argument("--simulation", "-s", help="path to simulation output to be evaluated for similarity", type=str, required=True)
    parser.add_argument("--baseline", "-b", help="Optional. Path to simulation output to be used as baseline", type=str, default=defaultbaseline)
    parser.add_argument("--povertythreshold", "-pt", help="Value for poverty threshold used", type=float, default=defaultpovertythreshold)
Contributor:

The help text is not very descriptive about what the poverty threshold is. Is this a well-known concept for the users/academic peers?

Member Author:

You're right. @vmgaribay shall we elaborate on this? I recall that the way you suggested is some percentage (~10%) of the mean income. The problem with setting it that way for this test is that a change in the implementation might shift the income distribution. To (also) be sensitive to this, the idea was to use a fixed value (TBD) defined on the baseline simulation in a separate step. To make it possible to change this without hacking the code I added this CL argument, but I agree it is not very descriptive. Ideas, suggestions?

Contributor:

Yes, the threshold of 1 was just serving as a static placeholder; originally, it was to be the bottom 10th percentile, but we were trying to reduce variations. I have also seen a percentage of the median population income used.
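For reference, a minimal sketch of how a fixed threshold could be derived once from the baseline run, assuming the bottom-10th-percentile variant mentioned above (the helper name and percentile choice are illustrative, not part of this PR):

import pandas as pd

def derive_poverty_threshold(baseline_csv, percentile=10):
    # Hypothetical helper: take the final-step capital (k_t) distribution
    # of the baseline and return the bottom nth-percentile value.
    base = pd.read_csv(baseline_csv)
    final_step = base[base["Step"] == base["Step"].max()]
    return final_step["k_t"].quantile(percentile / 100)

Freezing the resulting number in a separate step, rather than recomputing it per run, keeps the test sensitive to implementation changes that shift the income distribution, as discussed above.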

Contributor:

Would it not be a good idea to set the poverty threshold either very high or very low, so that everyone is above or below it, for testing purposes? This should ensure that any randomness introduced in the simulation doesn't throw the tests out of whack as well.

Contributor:

@cpranav93 I don't think I follow

Contributor:

@vmgaribay, let's say we set poverty_threshold to 0; then every agent would always be above the poverty line, or vice versa if we set it to a very high value. This way we can deterministically ensure that the poverty-line calculation is tested and working...

but as I say this, I realise that this may not be the point of these tests (since they are not unit tests meant to check each and every functionality!). So feel free to follow along on my crazy journey, or just deboard, tuck and roll off!

Contributor:

Ah, I see where you were going with it. Yes, this is important for testing the test but not the test itself.
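As a sketch of that "testing the test" idea: an extreme threshold gives a deterministic expectation one could assert against (this assumes the column names used in this PR and that k_t is non-negative):

import pandas as pd

base = pd.read_csv("../test_data/V2_500Agents_125StepsSeed1.csv")

# With a threshold of 0, no agent should ever be flagged as in poverty;
# with an arbitrarily large one, every agent should be.
assert not (base["k_t"] < 0).any()
assert (base["k_t"] < 1e12).all()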


    args = parser.parse_args()
    return args


def load_sim_file(path, **kwargs):
    # Load a simulation CSV, forwarding any keyword arguments to read_csv.
    data = []
    if os.path.isfile(path):
        try:
            data = pd.read_csv(path, **kwargs)
        except Exception:
            print(f"failed to read {path}")
    else:
        print(f"{path} is not a file")
    return data


def print_stat_test_output(pvalues, properties):
    if len(pvalues) != len(properties):
        print('mismatch in properties and p values')
    else:
        for pv, prop in zip(pvalues, properties):
            if pv > 0.05:
Contributor:

Should this p-value be a non-required parameter of the program?

Contributor:

I assume this value is taken from the scipy library example? Or is there a relation to our specific problem?

Member Author:

We could add a p-value argument. p = 0.05 is the standard for ~2-sigma significance on a test. No specific example, just probability theory; it's vanilla hypothesis testing.
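A minimal sketch of what such an argument could look like, following the style of the existing ones (the flag name is a suggestion, not part of this PR):

parser.add_argument("--pvalue", "-p", help="Significance level below which the null hypothesis of same parent distribution is rejected", type=float, default=0.05)

print_stat_test_output would then take this threshold as a parameter instead of the hard-coded 0.05.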

print(f"Null hypothesis of same parent distribution accepted for {prop} at p = {pv} \n")
else:
print(f"Null hypothesis of same parent distribution rejected for {prop} at p = {pv} \n")
Contributor:

Should there be a global acceptance output as well? As in, the complete test is only accepted when all sub-tests are accepted.

Member Author:

I guess so. One point to consider here (also pinging @vmgaribay) is the possibility of false negatives and/or different tests being weighted towards different parts of the distribution (center vs. tail), which may impact sensitivity.

Also: this test is solely for the purpose of testing outcomes while refactoring the model with no other changes. As soon as one changes inputs and/or algorithmic principles, differences may occur.

Easy to do an aggregation step now, so I'll implement it.
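One way the aggregation step could look (a sketch only; the actual implementation may differ):

def aggregate_results(pvalues, alpha=0.05):
    # Overall pass only if no sub-test rejects the null hypothesis.
    passed = all(pv > alpha for pv in pvalues)
    print(f"Overall result: {'PASS' if passed else 'FAIL'}")
    return passed

Returning a process exit code from this (e.g. sys.exit(0 if passed else 1)) would also cover the "Tie into CI" TODO below.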



def perform_stat_test(simulation_file, baseline_file, povertythreshold):

    print('loading data ...\n')
    sim_data = load_sim_file(simulation_file)
    base_data = load_sim_file(baseline_file)

    """
    calculate derived properties
    (i) poverty
    """

    print("calculating derived properties ...\n")
    sim_data["InPoverty"] = sim_data["k_t"] < povertythreshold
    base_data["InPoverty"] = base_data["k_t"] < povertythreshold

    sim_total_steps_in_poverty = sim_data.groupby("AgentID")[["InPoverty"]].sum()
    sim_max_consec_steps = sim_data.groupby("AgentID").apply(stu.max_consec)
Contributor:

Does the groupby command pass the income values for a single agent to max_consec and iteratively work through all the agents?

I ask because in max_consec, values is a 1D variable for "in_poverty" key.

Member Author:

The groupby command instantiates a series of views on the DataFrame object, one for each value of the parameter being grouped by, and passes these on individually. For the total steps, the code then sums only the "InPoverty" field for each grouped view (i.e. agent) and returns only the summed "InPoverty" column.

For max_consec, the input to the function is this series of grouped views, one by one. The code then sets Step, instead of AgentID, as the index value, enabling a loop over all the steps an agent has performed (the iloc command executes on the index). Because this is a single element (when selecting the "InPoverty" column), the boolean evaluation works.

Does this answer your question @cpranav93 ?
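A tiny self-contained example of the groupby/apply behaviour described above (toy data, not simulation output):

import pandas as pd

df = pd.DataFrame({
    "AgentID":   [1, 1, 2, 2],
    "Step":      [0, 1, 0, 1],
    "InPoverty": [True, True, False, True],
})

# apply() receives one sub-DataFrame per AgentID, holding all of that
# agent's rows; each returned Series becomes a row of the result,
# indexed by AgentID.
out = df.groupby("AgentID").apply(lambda g: pd.Series({"Total": g["InPoverty"].sum()}))
print(out)
#          Total
# AgentID
# 1            2
# 2            1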


    base_total_steps_in_poverty = base_data.groupby("AgentID")[["InPoverty"]].sum()
    base_max_consec_steps = base_data.groupby("AgentID").apply(stu.max_consec)

    """
    (ii) Technology regime switches
    """

    sim_switches = sim_data.groupby("AgentID").apply(stu.tally_switches)

    base_switches = base_data.groupby("AgentID").apply(stu.tally_switches)

    """
    join derived properties for further analysis
    """

    sim_derived = sim_total_steps_in_poverty.join(sim_max_consec_steps, on="AgentID").join(sim_switches, on="AgentID")
    base_derived = base_total_steps_in_poverty.join(base_max_consec_steps, on="AgentID").join(base_switches, on="AgentID")

    """
    Perform two sample statistical distribution tests on final distributions of simulated properties
    """
    print("performing Cramer von Mises two sample tests ...\n")
    # Cramer von Mises
    cvm_kt_p, cvm_kt_s = stu.CramerVonMises(sim_data, base_data, 'k_t', collateSteps=True)

    cvm_total_poverty_p, cvm_total_poverty_s = stu.CramerVonMises(sim_derived, base_derived, 'InPoverty')
    cvm_consec_poverty_p, cvm_consec_poverty_s = stu.CramerVonMises(sim_derived, base_derived, 'MaxConsec')

    cvm_LtoH_p, cvm_LtoH_statistic = stu.CramerVonMises(sim_derived, base_derived, 'LtoH')
    cvm_HtoL_p, cvm_HtoL_statistic = stu.CramerVonMises(sim_derived, base_derived, 'HtoL')

    """
    Print results.

    TODO: Tie into CI
    """
    print_stat_test_output([cvm_kt_p, cvm_total_poverty_p, cvm_consec_poverty_p, cvm_LtoH_p, cvm_HtoL_p],
                           ['k_t', 'total steps in poverty', 'max consecutive steps in poverty',
                            'switches from L to H technology', 'switches from H to L technology'])


def main():
    args = parse_args()

    simulation_file = args.simulation
    baseline_file = args.baseline
    povertythreshold = args.povertythreshold

    perform_stat_test(simulation_file, baseline_file, povertythreshold)


if __name__ == "__main__":
    main()


65 changes: 65 additions & 0 deletions stat_tests/stat_test_utils.py
@@ -0,0 +1,65 @@
# From agent data, calculates the maximum consecutive number and the total number of
# time steps spent below a set threshold for each agent, evaluated at t_final

import pandas as pd
import scipy.stats as st


def max_consec(values):
    # Compare the poverty bool of each step and the previous step; if both are true, raise the consecutive-day tally.
    # If the running tally is greater than the current maximum, set a new maximum.
    # If there were no consecutive days, return 0; otherwise return the maximum (plus one day).

    values = values.set_index("Step")
    tally, maximum = 0, 0
    for i in range(1, len(values)):
        if values.iloc[i]["InPoverty"] and values.iloc[i-1]["InPoverty"]:
            tally += 1
        else:
            if tally > maximum:
                maximum = tally
            tally = 0
    # In case someone was in poverty until the end, do a final check for the maximum (niche case)
    if tally > maximum:
        maximum = tally
    if maximum > 0:
        return pd.Series({"MaxConsec": maximum + 1})
    return pd.Series({"MaxConsec": 0})



def tally_switches(values):
    # Compare the technology of each step and the previous step; if equal, pass to the next step.
    # If unequal, add 1 to the Low-to-High tally if the previous value was "L", or
    # 1 to the High-to-Low tally if the previous value was not "L". Return the tallies.
    values = values.set_index("Step")
    LtoHtally, HtoLtally = 0, 0
    for i in range(1, len(values)):
        if values.iloc[i]["technology"] == values.iloc[i-1]["technology"]:
            pass
        elif values.iloc[i-1]["technology"] == "L":
            LtoHtally += 1
        else:
            HtoLtally += 1

    return pd.Series({"LtoH": LtoHtally, "HtoL": HtoLtally})



def CramerVonMises(sim, base, prop, collateSteps=False):
    if collateSteps:
        # Compare only the final-step distributions of each run
        maxStepSim = max(sim["Step"])
        maxStepBase = max(base["Step"])
        simProp = sim[sim['Step'] == maxStepSim][prop]
        baseProp = base[base['Step'] == maxStepBase][prop]
    else:
        simProp = sim[prop]
        baseProp = base[prop]

    cvm = st.cramervonmises_2samp(simProp, baseProp)
    return cvm.pvalue, cvm.statistic




