<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Eric Cramer</title>
<link>http://emcramer.github.io/</link>
<atom:link href="http://emcramer.github.io/index.xml" rel="self" type="application/rss+xml" />
<description>Eric Cramer</description>
<generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>© 2021</copyright><lastBuildDate>Fri, 23 Apr 2021 09:00:00 -0800</lastBuildDate>
<image>
<url>http://emcramer.github.io/images/icon_hud9b055eee01d4de0eb2efdace9b98bee_9571_512x512_fill_lanczos_center_2.png</url>
<title>Eric Cramer</title>
<link>http://emcramer.github.io/</link>
</image>
<item>
<title>The association of the Stanford Expectations of Treatment Scale (SETS) with expectations on pain and opioid dose in a patient-centered prescription opioid tapering program</title>
<link>http://emcramer.github.io/talk/aapm2021/</link>
<pubDate>Fri, 23 Apr 2021 09:00:00 -0800</pubDate>
<guid>http://emcramer.github.io/talk/aapm2021/</guid>
<description><p>The Stanford Expectations of Treatment Scale (SETS) is a tool developed to measure patient outcome expectancy prior to treatment. It has been validated in patients receiving surgical and pain interventions, but its relationship with expectancies regarding opioid use tapering has not previously been examined. We aim to characterize the relationship between the SETS scores and patient expectancy regarding opioid tapering and pain levels post tapering.</p>
</description>
</item>
<item>
<title>Acute Pain Predictors of Remote Postoperative Pain Resolution After Hand Surgery</title>
<link>http://emcramer.github.io/publication/hah-2021/</link>
<pubDate>Sun, 18 Apr 2021 00:00:00 +0000</pubDate>
<guid>http://emcramer.github.io/publication/hah-2021/</guid>
<description><script type="text/javascript" src="https://d1bxh8uas1mnw7.cloudfront.net/assets/embed.js"></script>
<div class="altmetric-embed" data-badge-type="donut" data-altmetric-id="104265284" data-doi="10.1007/s40122-021-00263-y"></div>
</description>
</item>
<item>
<title>Development and validation of the Collaborative Health Outcomes Information Registry body map</title>
<link>http://emcramer.github.io/publication/scherrer-2021/</link>
<pubDate>Sun, 24 Jan 2021 16:44:05 -0800</pubDate>
<guid>http://emcramer.github.io/publication/scherrer-2021/</guid>
<description><script type="text/javascript" src="https://d1bxh8uas1mnw7.cloudfront.net/assets/embed.js"></script><div class="altmetric-embed" data-badge-type="donut" data-altmetric-id="98609954" />
<p><strong>Introduction:</strong>
Critical for the diagnosis and treatment of chronic pain is the anatomical distribution of pain. Several body maps allow patients to indicate pain areas on paper; however, each has its limitations.</p>
<p><strong>Objectives:</strong>
To provide a comprehensive body map that can be universally applied across pain conditions, we developed the electronic Collaborative Health Outcomes Information Registry (CHOIR) self-report body map by performing an environmental scan and assessing existing body maps.</p>
<p><strong>Methods:</strong>
After initial validation using a Delphi technique, we compared (1) pain location questionnaire responses of 530 participants with chronic pain with (2) their pain endorsements on the CHOIR body map (CBM) graphic. A subset of participants (n = 278) repeated the survey 1 week later to assess test–retest reliability. Finally, we interviewed a patient cohort from a tertiary pain management clinic (n = 28) to identify reasons for endorsement discordances.</p>
<p><strong>Results:</strong>
The intraclass correlation coefficient between the total number of body areas endorsed on the survey and those from the body map was 0.86 and improved to 0.93 at follow-up. The intraclass correlation coefficient of the 2 body map graphics separated by 1 week was 0.93. Further examination demonstrated high consistency between the questionnaire and CBM graphic (&lt;10% discordance) in most body areas except for the back and shoulders (≈15–19% discordance). Participants attributed inconsistencies to misinterpretation of body regions and laterality, the latter of which was addressed by modifying the instructions.</p>
<p><strong>Conclusions:</strong>
Our data suggest that the CBM is a valid and reliable instrument for assessing the distribution of pain.</p>
</description>
</item>
<item>
<title>COVID-19 Tracker</title>
<link>http://emcramer.github.io/project/covid19tracker/</link>
<pubDate>Sat, 28 Mar 2020 11:36:43 -0800</pubDate>
<guid>http://emcramer.github.io/project/covid19tracker/</guid>
<description><h1 id="tracking-the-covid-19-pandemic">Tracking the COVID-19 Pandemic</h1>
<p>After my county instituted a shelter-in-place lockdown due to the COVID-19 pandemic (caused by the SARS-CoV-2 virus), I was curious to know how bad the situation was. The nonpartisan website <a href="https://usafacts.org/">usafacts.org/</a> publishes national, state, and county level data for the United States, and provides a substantial amount of information for making your own calculations and conclusions.</p>
<p>I am particularly interested in the <strong>change rate</strong>, the change in the number of new cases of COVID-19 per day. This simple calculation gives an estimate of how quickly the virus is spreading, and whether measures put in place to curb the disease are working.</p>
<p>I intend to add other metrics to this project, such as R0 calculations and forecasting.</p>
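<p>As a back-of-the-envelope sketch of the change-rate calculation (the cumulative counts below are made up for illustration; the app itself pulls real data from usafacts.org), the change rate is just the day-over-day difference in new cases:</p>

```r
# hypothetical cumulative case counts over six days (made-up numbers)
cumulative = c(10, 18, 30, 50, 85, 140)
# daily new cases: difference between consecutive cumulative totals
new_cases = diff(cumulative)
# change rate: day-over-day change in the number of new cases
change_rate = diff(new_cases)
print(new_cases)    # 8 12 20 35 55
print(change_rate)  # 4 8 15 20
```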
<iframe width="720" height="480" scrolling="no" frameborder="no" src="https://eric-cramer.shinyapps.io/covid19tracker/"></iframe>
<p>Open the tracking tool in a separate window <a href="https://eric-cramer.shinyapps.io/covid19tracker/">here</a>. GitHub repository <a href="https://github.com/emcramer/covid19project">here</a>.</p>
</description>
</item>
<item>
<title>Multiple linear regression in a distributed system</title>
<link>http://emcramer.github.io/post/multiple-linear-regression-in-a-distributed-system/</link>
<pubDate>Sat, 15 Feb 2020 00:00:00 +0000</pubDate>
<guid>http://emcramer.github.io/post/multiple-linear-regression-in-a-distributed-system/</guid>
<description><p><img src="http://emcramer.github.io/post/2020-02-15-multiple-linear-regression-in-a-distributed-system_files/linear_regression.png" alt="Credit: XKCD"></p>
<p>In a previous <a href="https://emcramer.github.io/post/starting-distributed-computing/">post</a> I talked about adapting a linear regression algorithm so it can be used in a distributed system. Essentially, a master computer oversees computations run on local data, and the algorithm pauses midway through to send summary statistics to the master. In this way, the master receives enough information to reconstruct the model without seeing the underlying data.</p>
<p><img src="http://emcramer.github.io/post/2020-02-15-multiple-linear-regression-in-a-distributed-system_files/example-distcomp.png" alt="Example distributed computation system."></p>
<p>For a linear regression model, we can simply have the master iteratively pass candidate <code>\(\beta\)</code> values to the workers, which then return their local sum of squared residuals. By minimizing the sum of the squared residuals on each iteration, the master can find the optimal values for the <code>\(\beta\)</code>s in <code>\(y=\beta_0 + \beta_1x\)</code>.</p>
<p>We can expand this from simple linear regression with a single predictor to multiple linear regression with several predictors ($y=\beta_0 + \beta_1x_1 + \dots +\beta_nx_n$). The sum of the squared residuals is then the summary statistic of a matrix operation, so a master controller can pass a vector of <code>\(\beta\)</code> parameters to the workers on each iteration and receive the RSS in return:</p>
<p><code>$$RSS_{local}=\sum_{i=1}^m\left(\beta_0 + \begin{bmatrix} x_{i,1} &amp; \dots &amp; x_{i,n} \end{bmatrix}\begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \end{bmatrix} - y_i\right)^2$$</code></p>
<p>With a few minor changes in <a href="https://rextester.com/TDCUWC73705">code</a> from my previous post, we can adjust our loss function to accommodate a vector of <code>\(\beta\)</code>s.</p>
<pre><code class="language-r"># define a residual sum of squares function to handle multiple sites
multi.min.RSS &lt;- function(sites, par){
  rs &lt;- 0
  # calculate the residuals from each data source
  for(site in sites){
    tmp_mat &lt;- as.matrix(site$data[, 1:(ncol(site$data)-2)])
    tmps &lt;- par[1] + tmp_mat %*% par[-1] - site$data$y
    rs &lt;- rs + sum(tmps^2)
  }
  # return the sum of the squared residuals
  return(rs)
}
</code></pre>
<p>Now all we need to do is simulate some multivariate data and test everything out.</p>
<pre><code class="language-r"># general function for simulating a sample data set given parameters
sim.data &lt;- function(mu, sig, amt, seed, mpar, nl){
  # Simulate data for the practice
  set.seed(seed)
  x &lt;- replicate(length(mpar)-1, rnorm(n=amt, mean=mu, sd=sig))
  # create the &quot;true&quot; equation for the regression
  a.true &lt;- mpar[-1]
  b.true &lt;- mpar[1]
  y &lt;- x %*% a.true + b.true
  # set the noise level
  noise &lt;- rnorm(n=amt, mean=0, sd=nl)
  d &lt;- data.frame(x
                  , &quot;y_true&quot;=y
                  , &quot;y&quot;=y + noise)
  return(d)
}
true_vals &lt;- c(2,4,6,8)
sim.data1 &lt;- sim.data(10, 2, 100, 2020, true_vals, 1)
sim.data2 &lt;- sim.data(10, 2, 100, 2019, true_vals, 1)
sites &lt;- list(site1 = list(data=sim.data1), site2 = list(data=sim.data2))
knitr::kable(head(sim.data1))
</code></pre>
<table>
<thead>
<tr>
<th align="right">X1</th>
<th align="right">X2</th>
<th align="right">X3</th>
<th align="right">y_true</th>
<th align="right">y</th>
</tr>
</thead>
<tbody>
<tr>
<td align="right">10.753944</td>
<td align="right">6.542432</td>
<td align="right">8.540934</td>
<td align="right">152.5978</td>
<td align="right">153.5145</td>
</tr>
<tr>
<td align="right">10.603097</td>
<td align="right">8.017478</td>
<td align="right">11.702755</td>
<td align="right">186.1393</td>
<td align="right">185.9124</td>
</tr>
<tr>
<td align="right">7.803954</td>
<td align="right">8.828989</td>
<td align="right">9.207017</td>
<td align="right">159.8459</td>
<td align="right">161.0281</td>
</tr>
<tr>
<td align="right">7.739188</td>
<td align="right">10.767043</td>
<td align="right">10.813357</td>
<td align="right">184.0659</td>
<td align="right">185.5874</td>
</tr>
<tr>
<td align="right">4.406931</td>
<td align="right">11.493330</td>
<td align="right">7.922893</td>
<td align="right">151.9708</td>
<td align="right">153.4088</td>
</tr>
<tr>
<td align="right">11.441147</td>
<td align="right">8.143158</td>
<td align="right">7.488237</td>
<td align="right">156.5294</td>
<td align="right">158.8567</td>
</tr>
</tbody>
</table>
<pre><code class="language-r">knitr::kable(head(sim.data2))
</code></pre>
<table>
<thead>
<tr>
<th align="right">X1</th>
<th align="right">X2</th>
<th align="right">X3</th>
<th align="right">y_true</th>
<th align="right">y</th>
</tr>
</thead>
<tbody>
<tr>
<td align="right">11.477045</td>
<td align="right">8.309900</td>
<td align="right">11.441690</td>
<td align="right">189.3011</td>
<td align="right">189.1410</td>
</tr>
<tr>
<td align="right">8.970479</td>
<td align="right">11.715855</td>
<td align="right">9.210739</td>
<td align="right">181.8630</td>
<td align="right">181.7065</td>
</tr>
<tr>
<td align="right">6.719637</td>
<td align="right">8.632787</td>
<td align="right">11.965325</td>
<td align="right">176.3979</td>
<td align="right">177.0496</td>
</tr>
<tr>
<td align="right">11.832074</td>
<td align="right">9.978611</td>
<td align="right">7.191396</td>
<td align="right">166.7311</td>
<td align="right">167.2667</td>
</tr>
<tr>
<td align="right">7.465036</td>
<td align="right">7.191671</td>
<td align="right">11.600407</td>
<td align="right">167.8134</td>
<td align="right">168.0076</td>
</tr>
<tr>
<td align="right">11.476496</td>
<td align="right">12.783553</td>
<td align="right">8.498515</td>
<td align="right">192.5954</td>
<td align="right">192.8948</td>
</tr>
</tbody>
</table>
<p>We can test our loss function with a call to <code>optim</code> and compare the results to the base R linear modeling function (and the true values of our simulation).</p>
<pre><code class="language-r">param.fit &lt;- optim(par=c(0,0,0,0),
                   fn = multi.min.RSS,
                   hessian = TRUE,
                   sites=sites)
# stack the data frames vertically for later verification
sim.data3 &lt;- as.data.frame(rbind(sim.data1, sim.data2))
mlm &lt;- lm(y~., data=sim.data3[,-4])
d &lt;- data.frame(&quot;True Betas&quot;=true_vals
                , &quot;Base R Coefficients&quot;=coef(mlm)
                , &quot;Distributed Coefficients&quot;=param.fit$par)
knitr::kable(d)
</code></pre>
<table>
<thead>
<tr>
<th align="left"></th>
<th align="right">True.Betas</th>
<th align="right">Base.R.Coefficients</th>
<th align="right">Distributed.Coefficients</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">(Intercept)</td>
<td align="right">2</td>
<td align="right">2.604539</td>
<td align="right">2.642250</td>
</tr>
<tr>
<td align="left">X1</td>
<td align="right">4</td>
<td align="right">4.018984</td>
<td align="right">4.021583</td>
</tr>
<tr>
<td align="left">X2</td>
<td align="right">6</td>
<td align="right">5.936988</td>
<td align="right">5.935438</td>
</tr>
<tr>
<td align="left">X3</td>
<td align="right">8</td>
<td align="right">7.978939</td>
<td align="right">7.973534</td>
</tr>
</tbody>
</table>
<p>Not far off!</p>
<p>You can run the full code <a href="https://rextester.com/PGLE10656">here</a>.</p>
</description>
</item>
<item>
<title>Predicting CRPS Limb Affectation</title>
<link>http://emcramer.github.io/project/apa2018/</link>
<pubDate>Tue, 21 Jan 2020 16:16:25 -0800</pubDate>
<guid>http://emcramer.github.io/project/apa2018/</guid>
<description><h1 id="predicting-crps-limb-affectation-from-physical-and-psychological-factors">Predicting CRPS Limb Affectation from Physical and Psychological Factors</h1>
<p><a href="https://www.ninds.nih.gov/Disorders/Patient-Caregiver-Education/Fact-Sheets/Complex-Regional-Pain-Syndrome-Fact-Sheet">Complex Regional Pain Syndrome (CRPS)</a> is a severe and rare chronic pain condition that often spreads from an initially affected limb to other parts of the body. The underlying etiology and factors that influence the spread of CRPS are not well understood. Previous research has sought to explain the mechanism, onset, and pain severity of CRPS; however, the contribution of psychosocial factors to CRPS affectation has not been investigated fully.</p>
<p>I extracted data from the <a href="https://choir.stanford.edu/">Collaborative Health Outcomes Information Registry (CHOIR)</a>, which is an electronic patient registry and learning health system. Then I trained a random forest model to describe the role of psychophysical and psychosocial factors as predictors of CRPS limb affectation. To train my model, I defined several &ldquo;classes&rdquo; of CRPS affectation such as ipsilateral and contralateral spread.</p>
<p>I presented my findings at the American Psychological Association's 126th annual convention in San Francisco, and my presentation received the Society for Health Psychology's Outstanding Poster Presentation award.</p>
<p><img src="apa-sfhp-award-2.jpg" alt="Award certificate"></p>
</description>
</item>
<item>
<title>Modeling Calmodulin</title>
<link>http://emcramer.github.io/project/bmi214/</link>
<pubDate>Tue, 21 Jan 2020 16:03:23 -0800</pubDate>
<guid>http://emcramer.github.io/project/bmi214/</guid>
<description><h1 id="modeling-calmodulin">Modeling Calmodulin</h1>
<p>One of our projects in <a href="http://explorecourses.stanford.edu/search?view=catalog&amp;filter-coursestatus-Active=on&amp;page=0&amp;catalog=&amp;academicYear=20152016&amp;q=cs+107e&amp;collapse=">BMI 214 (Representational Algorithms for Molecular Biology)</a>, taught by <a href="https://en.wikipedia.org/wiki/Russ_Altman">Dr. Russ Altman</a>, was to create a molecular dynamics simulation of a protein. <a href="https://en.wikipedia.org/wiki/Molecular_dynamics">Molecular dynamics</a> uses computers to simulate the interactions of atoms and molecules over a period of time under known laws of physics. I represented a fragment of the protein <a href="https://en.wikipedia.org/wiki/Calmodulin">calmodulin</a> by simulating the interactions of its component atoms (each atom within each amino acid). This entails modeling how each atom's mass, velocity, energy, and forces change over time, calculating the effect of each moment in time on the subsequent moment.</p>
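<p>A toy one-dimensional sketch of that time-stepping idea (not the actual project code; a unit mass and spring constant are assumed for illustration) is a single particle on a harmonic spring, advanced with a semi-implicit Euler step:</p>

```r
# toy 1-D time-stepping sketch (hypothetical unit constants, not the project code)
mass = 1.0   # particle mass
k = 1.0      # spring constant of the harmonic potential
dt = 0.01    # time step
x = 1.0      # initial position
v = 0.0      # initial velocity
for (step in 1:1000) {
  force = -k * x                # force at the current moment
  v = v + (force / mass) * dt   # velocity updated from the previous moment
  x = x + v * dt                # position updated from the new velocity
}
print(x)  # the particle oscillates; after t = 10 it sits near cos(10)
```

<p>A real molecular dynamics run does the same loop, but with thousands of atoms, forces summed from bonded and non-bonded interaction terms, and a more careful integrator.</p>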
</description>
</item>
<item>
<title>Exon Mutability Score</title>
<link>http://emcramer.github.io/project/bmi273/</link>
<pubDate>Tue, 21 Jan 2020 11:36:43 -0800</pubDate>
<guid>http://emcramer.github.io/project/bmi273/</guid>
<description><h1 id="class-project">Class Project</h1>
<p>For the final project of <a href="http://explorecourses.stanford.edu/search?view=catalog&amp;filter-coursestatus-Active=on&amp;page=0&amp;catalog=&amp;academicYear=20172018&amp;q=biomedin273a&amp;collapse=">BMI 273A (The Human Genome Source Code)</a>, taught by <a href="http://bejerano.stanford.edu/pi.html">Dr. Gill Bejerano</a>, I worked with a group to devise a statistic to measure how &ldquo;mutable&rdquo; a given exon is (some mutations impact the function of an exonic product, while others do not, i.e. synonymous vs. nonsynonymous mutations). We used the <a href="http://exac.broadinstitute.org/">ExAC</a> dataset to isolate exons and calculate an <em>EMTM</em> score, our version of the <a href="https://doi.org/10.1371/journal.pgen.1003709">RVIS score</a>. We presented our metric, and we maintain the results, explorations, and code on <a href="https://github.com/ostrowr/cs273a-project">GitHub</a>.</p>
<p><img src="rvis-lollipop.png" alt="Exon mutabilities of each chromosome based on our mutability score."></p>
</description>
</item>
<item>
<title>Starting distributed computing</title>
<link>http://emcramer.github.io/post/starting-distributed-computing/</link>
<pubDate>Thu, 16 Jan 2020 00:00:00 +0000</pubDate>
<guid>http://emcramer.github.io/post/starting-distributed-computing/</guid>
<description>
<div id="quick-intro" class="section level2">
<h2>Quick Intro</h2>
<p>As hospitals, care providers, and private companies collect more data, they develop rich databases that can be used to improve patient care (e.g. through precision medicine). Research institutions, however, often cannot share their data with each other because of privacy concerns and HIPAA compliance. This poses a hurdle to inter-institutional collaboration and creates a research bottleneck: an unfortunate instance where good data security practices block collaborative work that has the potential to solve problems such as <a href="https://medium.com/better-programming/bias-racist-robots-and-ai-the-problems-in-the-coding-that-coders-fail-to-see-305f6f324793">bias in AI</a>.</p>
<p>One way we may circumvent this issue is with distributed computation. Through an appropriately configured distributed computing service, it is possible to fit models on data that could be combined by <a href="https://datacarpentry.org/stata-economics/img/append-merge.png">stacking vertically</a> (i.e., datasets sharing the same columns) but that is otherwise kept electronically separate.</p>
<div class="figure">
<img src="https://datacarpentry.org/stata-economics/img/append-merge.png" alt="" />
<p class="caption">Partitioning data</p>
</div>
<p>The underlying premise is that most (all?) computations for modelling require multiple steps. If we take values calculated during an intermediate step of a computation performed at individual sites, then we can aggregate these values in a central location to produce a final model. Since the sites don’t talk to each other, and the central location only receives a summary statistic, <strong>none of the underlying information gets shared</strong>. Each institution or entity can hold on to its data and maintain its security while still helping the common good.</p>
<p>That is the vision and purpose of the <a href="https://cran.r-project.org/web/packages/distcomp/index.html"><code>distcomp</code></a> R package, which makes the distributed computation process simpler through a series of GUIs that walk a user through the process. The only hang-up is that there are currently very few computations implemented for this distributed process.</p>
<p>That is why, in this post, I am going to prototype a distributed computation before adding it to the <code>distcomp</code> package. I am starting small, with linear regression (which was not included in the initial list of possible distributed computations).</p>
</div>
<div id="prototyping-locally" class="section level2">
<h2>Prototyping Locally</h2>
<p>To do a proper distributed computation, you need to have multiple <em>sites</em> and a <em>master</em> controlling instance. This involves configuring a server or VM. But you don’t really need to do that to <em>prototype</em> a computation and make sure you can integrate values at some intermediate step.</p>
<p>Consider computing the linear regression for some data set by minimizing the residual sum of squares. Given some set of data points <span class="math inline">\((x_i, y_i), i=1,...,n\)</span>, the residual (error) of a model&rsquo;s prediction is <span class="math inline">\(r_i = y_i - f(x_i, \beta)\)</span>. If this model is linear, <span class="math inline">\(f(x) = \beta_1 x + \beta_0\)</span>, then we can optimize it by minimizing the sum of the squared residuals: <span class="math inline">\(\sum_{i=1}^n r_i^2\)</span>.</p>
<p>It is at this step that we can split up the computation. We have the master send out the parameters (e.g. <span class="math inline">\(\beta_0, \beta_1\)</span>, etc.) for the computation to each of the participating sites. I can re-create this scenario locally by simulating two separate data sets.</p>
<pre class="r"><code># general function for simulating a sample data set given parameters
sim.data &lt;- function(mu, sig, amt, seed, mpar, nl){
  # Simulate data for the practice
  set.seed(seed)
  x &lt;- rnorm(n=amt, mean=mu, sd=sig)
  # create the &quot;true&quot; equation for the regression
  a.true &lt;- mpar[1]
  b.true &lt;- mpar[2]
  y &lt;- x*a.true + b.true
  # set the noise level
  noise &lt;- rnorm(n=amt, mean=0, sd=nl)
  d &lt;- cbind(x, y, y + noise)
  colnames(d) &lt;- c(&quot;x&quot;, &quot;y_true&quot;, &quot;y&quot;)
  return(as.data.frame(d))
}
sim.data1 &lt;- sim.data(10, 2, 100, 2020, c(2,8), 1)
sim.data2 &lt;- sim.data(10, 2, 100, 2019, c(2,8), 1)
sites &lt;- list(site1 = list(data=sim.data1), site2 = list(data=sim.data2))
head(sim.data1)</code></pre>
<pre><code>## x y_true y
## 1 10.753944 29.50789 27.77910
## 2 10.603097 29.20619 28.21493
## 3 7.803954 23.60791 23.02240
## 4 7.739188 23.47838 23.86190
## 5 4.406931 16.81386 17.56053
## 6 11.441147 30.88229 29.95387</code></pre>
<pre class="r"><code>head(sim.data2)</code></pre>
<pre><code>## x y_true y
## 1 11.477045 30.95409 30.10904
## 2 8.970479 25.94096 26.79889
## 3 6.719637 21.43927 20.75567
## 4 11.832074 31.66415 31.65345
## 5 7.465036 22.93007 21.52591
## 6 11.476496 30.95299 32.34477</code></pre>
<p>Here we simulate two data sets with 100 observations, a mean of 10, and a standard deviation of 2. The true values for the linear model are a slope of 2 and an intercept of 8.</p>
<p>Then each site will calculate the residuals from its own data and send back the summary statistic - the sum of its squared residuals.</p>
<pre class="r"><code># define a residual sum of squares function to handle multiple sites
multi.min.RSS &lt;- function(sites, par){
  rs &lt;- 0
  # calculate the residuals from each data source
  for(site in sites){
    tmps &lt;- par[1] + par[2] * site$data$x - site$data$y
    rs &lt;- rs + sum(tmps^2)
  }
  # return the sum of the squared residuals
  return(rs)
}</code></pre>
<p>All that is left to do is minimize the combined RSS over the candidate parameters. We can use base R’s <code>optim</code> function to do this.</p>
<pre class="r"><code>param.fit &lt;- optim(par=c(0,1),
                   fn = multi.min.RSS,
                   hessian = TRUE,
                   sites=sites)
print(&quot;Distributed linear model results:&quot;)</code></pre>
<pre><code>## [1] &quot;Distributed linear model results:&quot;</code></pre>
<pre class="r"><code>print(paste(&quot;Intercept: &quot;, param.fit$par[1], &quot; Slope: &quot;, param.fit$par[2]))</code></pre>
<pre><code>## [1] &quot;Intercept: 7.77183635600249 Slope: 2.00840909246344&quot;</code></pre>
<p>We can compare the results to R’s built-in linear model function, <code>lm</code>, by stacking the data from the two “sites” and running a linear model on the full data set.</p>
<pre class="r"><code># stack the data frames vertically for later verification
sim.data3 &lt;- as.data.frame(rbind(sim.data1, sim.data2))
print(&quot;Base R linear model on the full data set:&quot;)</code></pre>
<pre><code>## [1] &quot;Base R linear model on the full data set:&quot;</code></pre>
<pre class="r"><code>lm(y~x, data=sim.data3)</code></pre>
<pre><code>##
## Call:
## lm(formula = y ~ x, data = sim.data3)
##
## Coefficients:
## (Intercept) x
## 7.776 2.008</code></pre>
<p>Pretty similar!</p>
<p>Click <a href="https://rextester.com/TDCUWC73705">here</a> to run the code.</p>
</div>
</description>
</item>
<item>
<title>Factors Associated With Acute Pain Estimation, Postoperative Pain Resolution, Opioid Cessation, and Recovery</title>
<link>http://emcramer.github.io/publication/hah-2019/</link>
<pubDate>Fri, 01 Mar 2019 00:00:00 +0000</pubDate>
<guid>http://emcramer.github.io/publication/hah-2019/</guid>
<description><script type="text/javascript" src="https://d1bxh8uas1mnw7.cloudfront.net/assets/embed.js"></script><div class="altmetric-embed" data-badge-type="donut" data-altmetric-id="56321663" />
</description>
</item>
<item>
<title>Predicting the Incidence of Pressure Ulcers in the Intensive Care Unit Using Machine Learning</title>
<link>http://emcramer.github.io/publication/cramer-2019/</link>
<pubDate>Tue, 01 Jan 2019 00:00:00 +0000</pubDate>
<guid>http://emcramer.github.io/publication/cramer-2019/</guid>
<description></description>
</item>
<item>
<title>The somatic distribution of chronic pain and emotional distress utilizing the collaborative health outcomes information registry (CHOIR) bodymap</title>
<link>http://emcramer.github.io/publication/cramer-2018/</link>
<pubDate>Thu, 01 Mar 2018 00:00:00 +0000</pubDate>
<guid>http://emcramer.github.io/publication/cramer-2018/</guid>
<description></description>
</item>
</channel>
</rss>