Stat tests #5
base: master
Changes from all commits
8ef6e46
bf8390f
15011f9
39260c4
a228638
bcdc37e
@@ -0,0 +1,122 @@
import pandas as pd
import argparse
import seaborn as sns
import stat_test_utils as stu
import os


defaultbaseline = "../test_data/V2_500Agents_125StepsSeed1.csv"
defaultpovertythreshold = 1.0


def parse_args():
    """
    The tests can be run using
    python stat_test.py --simulation <path> [--baseline <path>] [--povertythreshold <value>]
    """
    parser = argparse.ArgumentParser()

    parser.add_argument("--simulation", "-s", help="path to simulation output to be evaluated for similarity", type=str, required=True)
    parser.add_argument("--baseline", "-b", help="Optional. Path to simulation output to be used as baseline", type=str, default=defaultbaseline)
    parser.add_argument("--povertythreshold", "-pt", help="Value for poverty threshold used", type=float, default=defaultpovertythreshold)

    args = parser.parse_args()
    return args


def load_sim_file(path, **kwargs):
    # Read a simulation CSV if it exists; report (rather than raise) on failure.
    data = []
    if os.path.isfile(path):
        try:
            data = pd.read_csv(path, **kwargs)
        except Exception:
            print(f"failed to read {path}")
    else:
        print(f"{path} is not a file")
    return data
def print_stat_test_output(pvalues, properties):
    if len(pvalues) != len(properties):
        print('mismatch in properties and p values')
    else:
        for pv, prop in zip(pvalues, properties):
            if pv > 0.05:
                print(f"Null hypothesis of same parent distribution accepted for {prop} at p = {pv} \n")
            else:
                print(f"Null hypothesis of same parent distribution rejected for {prop} at p = {pv} \n")

Should this p value be a non-required parameter of the program?

I assume this value is taken from the scipy library example? Or is there a relation to our specific problem?

We could add a p-value argument. p = 0.05 is the standard for 2-sigma significance on a test. No specific example, just probability theory; it's vanilla hypothesis testing.

Should there be a global acceptance output as well? As in, the complete test is only accepted when all sub-tests are accepted.

I guess so. One point to consider here (also pinging @vmgaribay) is the possibility of false negatives and/or different tests weighted towards different parts of the distribution (center vs. tail), which may impact sensitivity. Also: this test is solely for the purpose of testing outcomes while refactoring the model with no other changes. As soon as one changes inputs and/or algorithmic principles, differences may occur. Easy to do an aggregation step now, so I'll implement it.
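For concreteness, a minimal sketch of how both suggestions could look, assuming the significance level is exposed as an extra command-line argument. The --pvalue flag, the pvalue_threshold parameter, and the boolean return value are illustrative names, not part of this diff:

# Sketch only: configurable significance level plus an aggregated verdict.
# parser.add_argument("--pvalue", "-p", type=float, default=0.05)  # hypothetical extra CL argument

def print_stat_test_output(pvalues, properties, pvalue_threshold=0.05):
    if len(pvalues) != len(properties):
        print('mismatch in properties and p values')
        return False
    all_passed = True
    for pv, prop in zip(pvalues, properties):
        if pv > pvalue_threshold:
            print(f"Null hypothesis of same parent distribution accepted for {prop} at p = {pv} \n")
        else:
            print(f"Null hypothesis of same parent distribution rejected for {prop} at p = {pv} \n")
            all_passed = False
    # Global acceptance: the refactoring check passes only if every sub-test accepts.
    print(f"Overall: {'accepted' if all_passed else 'rejected'}\n")
    return all_passed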
def perform_stat_test(simulation_file, baseline_file, povertythreshold):

    print('loading data ...\n')
    sim_data = load_sim_file(simulation_file)
    base_data = load_sim_file(baseline_file)

    """
    calculate derived properties
    (i) poverty
    """

    print("calculating derived properties ...\n")
    sim_data["InPoverty"] = sim_data["k_t"] < povertythreshold
    base_data["InPoverty"] = base_data["k_t"] < povertythreshold
    sim_total_steps_in_poverty = sim_data.groupby("AgentID")[["InPoverty"]].sum()
    sim_max_consec_steps = sim_data.groupby("AgentID").apply(stu.max_consec)

    base_total_steps_in_poverty = base_data.groupby("AgentID")[["InPoverty"]].sum()
    base_max_consec_steps = base_data.groupby("AgentID").apply(stu.max_consec)

Does the groupby command pass the income values for a single agent to max_consec and iteratively work through all the agents? I ask because in max_consec, values is a 1D variable for the "in_poverty" key.

Yes. For max_consec, the input to the function is this series of grouped views, passed one by one. The code then sets Step, instead of AgentID, as the index value, enabling a loop over all the steps an agent has performed (the iloc command executes on the index). Because this is a single element (when selecting the "InPoverty" column), the boolean evaluation works. Does this answer your question @cpranav93?
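To illustrate the exchange above, a toy frame (values invented for this example) shows what each call to stu.max_consec receives: the full sub-DataFrame for one agent, which the function then scans row by row.

# Illustration only: made-up data for two agents over three steps.
import pandas as pd
import stat_test_utils as stu

toy = pd.DataFrame({
    "AgentID":   [1, 1, 1, 2, 2, 2],
    "Step":      [0, 1, 2, 0, 1, 2],
    "InPoverty": [True, True, False, False, True, True],
})
# groupby hands max_consec one agent's rows at a time; both agents here
# have two consecutive InPoverty steps, so MaxConsec is 2 for each.
print(toy.groupby("AgentID").apply(stu.max_consec))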
""" | ||
(ii) Technology regime switches | ||
""" | ||
|
||
sim_switches=sim_data.groupby("AgentID").apply(stu.tally_switches) | ||
|
||
base_switches=base_data.groupby("AgentID").apply(stu.tally_switches) | ||
|
||
""" | ||
join derived properties for further analysis | ||
""" | ||
|
||
sim_derived = sim_total_steps_in_poverty.join(sim_max_consec_steps, on="AgentID").join(sim_switches, on="AgentID") | ||
base_derived = base_total_steps_in_poverty.join(base_max_consec_steps, on="AgentID").join(base_switches, on="AgentID") | ||
|
||
|
||
""" | ||
Perform two sample statistical distribution tests on final distributions of simulated properties | ||
""" | ||
print("performing Cramer von Mises wo sample tests ...\n") | ||
#Cramer von mises | ||
cvm_kt_p, cvm_kt_s = stu.CramerVonMises(sim_data,base_data,'k_t',collateSteps=True) | ||
|
||
cvm_total_poverty_p, cvm_total_poverty_s = stu.CramerVonMises(sim_derived,base_derived,'InPoverty') | ||
cvm_consec_poverty_p, cvm_consec_poverty_s = stu.CramerVonMises(sim_derived,base_derived,'MaxConsec') | ||
|
||
cvm_LtoH_p, cvm_LtoH_statistic = stu.CramerVonMises(sim_derived,base_derived,'LtoH') | ||
cvm_HtoL_p, cvm_HtoL_statistic = stu.CramerVonMises(sim_derived,base_derived,'HtoL') | ||
|
||
|
||
""" | ||
Print results. | ||
|
||
TODO: Tie into CI | ||
""" | ||
print_stat_test_output([cvm_kt_p,cvm_total_poverty_p,cvm_consec_poverty_p,cvm_LtoH_p,cvm_HtoL_p],['k_t','total steps in poverty','max consecutive steps in poverty','switches from L to H technology','switches from H to L technology']) | ||
def main():
    args = parse_args()

    simulation_file = args.simulation
    baseline_file = args.baseline
    povertythreshold = args.povertythreshold

    perform_stat_test(simulation_file, baseline_file, povertythreshold)


if __name__ == "__main__":
    main()
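Since the TODO mentions tying this into CI, one possible way to surface the result to a CI job is to turn the aggregated verdict (see the sketch after print_stat_test_output above) into a process exit code. This assumes perform_stat_test is changed to return that aggregated boolean; none of this is part of the current diff:

# Sketch only: propagate an aggregate pass/fail to CI via the exit status.
import sys

def main():
    args = parse_args()
    # assumes perform_stat_test returns the aggregated boolean from the earlier sketch
    all_passed = perform_stat_test(args.simulation, args.baseline, args.povertythreshold)
    # A non-zero exit code makes a CI job fail when any sub-test rejects.
    sys.exit(0 if all_passed else 1)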
@@ -0,0 +1,65 @@
# From agent data, calculates the maximum consecutive number and total number of time steps
# spent below a set threshold for each agent at t_final

import pandas as pd
import scipy.stats as st


def max_consec(values):
    # Compare poverty bool of step and previous step; if both are true, raise the consecutive day tally.
    # If the running tally is greater than the current maximum, set a new maximum.
    # If there were no consecutive days, return 0, otherwise return the maximum (plus one day).

    values = values.set_index("Step")
    tally, maximum = 0, 0
    for i in range(1, len(values)):
        if values.iloc[i]["InPoverty"] and values.iloc[i - 1]["InPoverty"]:
            tally += 1
        else:
            if tally > maximum:
                maximum = tally
            tally = 0
    # In case someone was in poverty until the end, do a final check for the maximum (niche case)
    if tally > maximum:
        maximum = tally
    if maximum > 0:
        return pd.Series({"MaxConsec": maximum + 1})
    return pd.Series({"MaxConsec": 0})


def tally_switches(values):
    # Compare technology of step and previous step; if equal, pass to the next step.
    # If unequal, add 1 to the Low to High tally if the previous value was "L" or
    # 1 to the High to Low tally if the previous value was not "L". Return the tallies.
    values = values.set_index("Step")
    LtoHtally, HtoLtally = 0, 0
    for i in range(1, len(values)):
        if values.iloc[i]["technology"] == values.iloc[i - 1]["technology"]:
            pass
        elif values.iloc[i - 1]["technology"] == "L":
            LtoHtally += 1
        else:
            HtoLtally += 1

    return pd.Series({"LtoH": LtoHtally, "HtoL": HtoLtally})


def CramerVonMises(sim, base, prop, collateSteps=False):
    # When collateSteps is True, compare only the distributions at the final step;
    # otherwise compare the per-agent derived properties directly.
    if collateSteps:
        maxStepSim = max(sim["Step"])
        maxStepBase = max(base["Step"])
        simProp = sim[sim['Step'] == maxStepSim][prop]
        baseProp = base[base['Step'] == maxStepBase][prop]
    else:
        simProp = sim[prop]
        baseProp = base[prop]

    cvm = st.cramervonmises_2samp(simProp, baseProp)
    return cvm.pvalue, cvm.statistic
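For reference, a minimal usage sketch of the wrapper on made-up per-agent values (the inputs are invented; scipy.stats.cramervonmises_2samp returns a result object exposing .pvalue and .statistic):

# Illustration only: compare a derived property between two made-up runs.
import pandas as pd
import stat_test_utils as stu

sim_derived = pd.DataFrame({"MaxConsec": [2, 0, 3, 1, 4, 2, 5, 0]})
base_derived = pd.DataFrame({"MaxConsec": [1, 0, 2, 2, 3, 1, 4, 0]})

pvalue, statistic = stu.CramerVonMises(sim_derived, base_derived, "MaxConsec")
# A p-value above the chosen threshold (0.05 in stat_test.py) means the two
# samples are consistent with the same parent distribution.
print(pvalue, statistic)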
Not very descriptive what the poverty threshold is. Is this a well-known concept for the users/academic peers?

You're right. @vmgaribay shall we elaborate on this? I recall that the way you suggested it is some percentage (~10%) of the mean income. The problem with setting it in that manner for this test is that a change in the implementation might shift the income distribution. To (also) be sensitive to this, the idea was to use a fixed value (TBD) defined on the baseline simulation in a separate step. To make it possible to change this without hacking the code I added this CL argument, but I agree it is not very descriptive. Ideas, suggestions?

Yes, the threshold of 1 was just serving as a static placeholder; originally, it was to be the bottom 10th percentile, but we were trying to reduce variations. I have also seen a percentage of the median population income used.

Would it not be a good idea to set the poverty threshold either very high or very low so that everyone is above or under it, for testing purposes? This should ensure that any randomness introduced in the simulation doesn't throw the tests out of whack as well.

@cpranav93 I don't think I follow.

@vmgaribay, let's say that if we set poverty_threshold to 0, then every agent would always be above the poverty line, or vice versa if we set it to a very high value. This way we can deterministically ensure that the poverty line calculation is tested and working... but as I say this, I realise that this may not be the point of these tests (since they are not unit tests meant to check each and every functionality!). So feel free to follow along on my crazy journey, or just deboard, tuck and roll off!

Ah, I see where you were going with it. Yes, this is important for testing the test but not the test itself.
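One way the fixed-value idea from this thread could be pinned down, sketched under the assumption that k_t is the income-like quantity being thresholded and that 10% of the baseline median is the chosen fraction (the helper name and the fraction are illustrative, not decisions made here):

# Sketch only: derive a fixed poverty threshold from the baseline run in a separate step,
# then pass the resulting number through the existing --povertythreshold argument.
import pandas as pd

def derive_poverty_threshold(baseline_csv, fraction=0.1):
    # fraction of the median k_t over the whole baseline output (hypothetical choice)
    base = pd.read_csv(baseline_csv)
    return fraction * base["k_t"].median()

# threshold = derive_poverty_threshold("../test_data/V2_500Agents_125StepsSeed1.csv")
# then: python stat_test.py -s my_run.csv -pt <threshold>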