
OBJECTIVE - 2 

Predicting the number of prescriptions based on specialty, credential, gender, etc. 

Zero-inflated count models assume that the observations originate either from a “susceptible” population that generates zero and positive counts according to a count distribution or from a “non-susceptible” population, which produces additional zeros [1, 2]. Thus, while a subject with a positive count is considered to belong to the “susceptible” population, subjects with zero counts may belong to either of the two latent populations. We denote the observed values of the response variable as y = (y1, y2, ..., yn)′. Following Lambert [2], a ZIP mixture distribution can be written as 

Picture3.png
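
For reference, the standard form of the ZIP mixture from Lambert's formulation can be written out as below. The symbols are notation assumed here and may differ from the figure: π_i is the probability that observation i comes from the "non-susceptible" population, and μ_i is the Poisson mean of the "susceptible" population (μ is used instead of the more common λ to avoid confusion with the lasso penalty parameter λ).

```latex
P(Y_i = y_i) =
\begin{cases}
\pi_i + (1 - \pi_i)\, e^{-\mu_i}, & y_i = 0, \\[4pt]
(1 - \pi_i)\, \dfrac{e^{-\mu_i}\, \mu_i^{y_i}}{y_i!}, & y_i = 1, 2, \ldots
\end{cases}
```

The first line shows the two sources of zeros: a structural zero with probability π_i, or a Poisson zero from the susceptible population with probability (1 - π_i) e^{-μ_i}.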

Approach: Lasso Penalized ZIP Model 
 

Reason 

The lasso shrinks the regression coefficients toward zero by penalizing the model with the sum of the absolute values of the coefficients, scaled by a penalty parameter λ. As a result, the coefficients of the less contributive variables are forced to be exactly zero, and only the most contributive variables are kept in the final model. 

  • If λ = 0, there is no penalty, and we get the same coefficients as the unpenalized regression. 

  • If λ is very large, all coefficients are shrunk towards zero (λ is the value of the penalty parameter). 
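
The effect of λ described above can be illustrated with a minimal sketch. It is not the penalized ZIP fit itself: it assumes the simplest textbook case (squared-error loss with an orthonormal design), where the lasso solution for each coefficient is the soft-thresholding of its unpenalized estimate. The function name `soft_threshold` and the coefficient values are hypothetical and chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Lasso solution for one coefficient, assuming an orthonormal design.

    z   : unpenalized coefficient estimate
    lam : penalty parameter (lambda >= 0)
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small coefficients are set exactly to zero


# Hypothetical unpenalized estimates (illustrative values, not from the study).
unpenalized = [2.5, -1.2, 0.3, -0.05]

for lam in (0.0, 0.5, 5.0):
    shrunk = [soft_threshold(b, lam) for b in unpenalized]
    kept = sum(1 for b in shrunk if b != 0.0)
    print(f"lambda={lam}: {kept} of {len(shrunk)} coefficients are nonzero")
```

With λ = 0 every coefficient is returned unchanged, a moderate λ zeroes out the smallest ones, and a large λ removes all of them, which is the variable-selection behaviour used to reduce the predictor set.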

As an initial model, we tried to fit a Poisson regression model, which is used to predict a dependent variable consisting of count data given one or more independent variables. However, our count data have an excess of zeros and do not satisfy the Poisson assumption that the mean equals the variance: here the variance is larger than the mean. We therefore could not proceed with the Poisson regression model. 

As a next approach, we tried the Zero-Inflated Poisson (ZIP) model, the Zero-Inflated Negative Binomial model, and the Conway-Maxwell-Poisson (COM-Poisson) model. The ZIP model is used for count data that have an excess of zeros; the Zero-Inflated Negative Binomial model is used for count outcomes that have an excess of zeros and are overdispersed; and the COM-Poisson model is used for overdispersed or underdispersed counts. However, these models did not converge on our data because of the large variance and the excess of zeros. 

We then moved to a linear regression model, which models the relationship between a scalar response and one or more explanatory variables. For the linear model, we applied a power transformation and identified and removed outliers, but the data were still not normal, and the maximum adjusted R-squared was only 37%. That is, the model explains only 37% of the total variance, which is very low. 

As a next step, we proceeded with the Lasso Penalized ZIP Model. We also tried several other models (Linear Model, Linear Mixed Model, Ridge Regression, Ridge Random Effect Model, Ridge Regression (Removing Variables), and Ridge Random Effect Model (Removing Variables)) and selected the best model by comparing AIC values; the table is shown below. The AIC (Akaike information criterion) is an estimator of out-of-sample prediction error and thereby of the relative quality of statistical models for a given set of data. Given a collection of models for the data, AIC estimates the quality of each model relative to each of the other models. 
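
The diagnostics that drive the model choices above (variance larger than the mean, more zeros than a Poisson model expects, and AIC for comparing fitted models) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the toy counts are hypothetical and are not the study data.

```python
import math
from statistics import mean, variance


def dispersion_summary(counts):
    """Compare a count sample with what a Poisson model would imply."""
    m = mean(counts)
    v = variance(counts)  # sample variance
    return {
        "mean": m,
        "variance": v,
        # Equals 1 under a Poisson model; values above 1 indicate overdispersion.
        "dispersion_index": v / m,
        "observed_zero_fraction": counts.count(0) / len(counts),
        # Fraction of zeros a Poisson model with the same mean would expect.
        "poisson_zero_fraction": math.exp(-m),
    }


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better model."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical counts with many zeros and a few large values.
toy_counts = [0, 0, 0, 0, 0, 0, 1, 2, 3, 12, 25, 40]
summary = dispersion_summary(toy_counts)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A dispersion index well above 1 together with an observed zero fraction far above the Poisson expectation is the pattern that motivates zero-inflated models; `aic` can then be applied to the maximized log-likelihood of each candidate model.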

Output 

We used the lasso penalized zero-inflated Poisson model to find the variables that affect the prediction of the number of prescriptions. Our data has 368 variables; using this model, we reduced them to 96 variables that have a high impact on the prediction. Those 96 variables are given below. 

APRN, CNP, DO, FNP, MD, MHS, MS, NP, PAC, PH, RN, RPAC, FENTANYL, HYDROCODONE.ACETAMINOPHEN, METHADONE.HCL, MORPHINE.SULFATE.ER, OXYCODONE.ACETAMINOPHEN, OXYCODONE.HCL, TRAMADOL.HCL, M, Anesthesiology, Family Practice, Geriatric Medicine, Internal Medicine, Interventional Pain Management, Oral Surgery (dentists only), Orthopedic Surgery, Osteopathic Manipulative Medicine, Pain Management, Physical Medicine and Rehabilitation, Rheumatology, AL, CA, FL, GA, IA, IN, KY, MI, MO, MS, NC, OH, OK, PR, SC, TN, TX, VA, ANP, APNC, APRN-BC, ARNPC, CRNP, FNPBC, FNPC, PHYSICIAN, General Practice, Neuropsychiatry, Registered Nurse, AR, LA, ARNP, FACEP, MB, RNCNP, WHNP, Hematology, SD, WY, ARANP, CNMFNP, APRNRN, RNFNP, Neuromusculoskeletal Medicine, Sports Medicine, ANPBC, Nurse Practitioner, CNS, RNCANP, Sleep Medicine, PHARMD, Oral & Maxillofacial Surgery, GNP-BC, NPF, RNP, RNCS, DP, PHARM, RNMSN, RPA, Clinic/Center, DDSMPH, DENTIST, Certified Clinical Nurse Specialist, Certified Nurse Midwife, Community Health Worker.

Zero Inflated Poisson Model 

After finding the significant variables, we fitted a zero-inflated Poisson model to predict the number of prescriptions. The Zero Inflated Poisson model is generally used to model count data that have an excess of zeros. The estimates and standard errors from the model are given in the tables below. 

Picture42.png
Picture41.png
Picture44.png
Picture43.png
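
To make the structure of the zero-inflated Poisson model concrete, here is a minimal sketch that fits an intercept-only ZIP (no covariates) by the EM algorithm, using only the Python standard library. It is a simplification made for illustration: the model reported above includes the selected predictors, and the software used for that fit is not stated. The names `zip_log_likelihood` and `fit_zip_intercept_only` and the toy counts are hypothetical.

```python
import math


def zip_log_likelihood(counts, pi, mu):
    """Log-likelihood of a ZIP model with zero-inflation pi and Poisson mean mu."""
    ll = 0.0
    for y in counts:
        if y == 0:
            # A zero is either structural (pi) or a Poisson zero.
            ll += math.log(pi + (1.0 - pi) * math.exp(-mu))
        else:
            ll += math.log(1.0 - pi) - mu + y * math.log(mu) - math.lgamma(y + 1)
    return ll


def fit_zip_intercept_only(counts, n_iter=200):
    """EM estimates of (pi, mu); assumes at least one positive count."""
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    pi, mu = 0.5, total / n  # simple starting values
    for _ in range(n_iter):
        # E-step: expected number of zeros that are structural.
        p_structural = pi / (pi + (1.0 - pi) * math.exp(-mu))
        expected_structural = n_zero * p_structural
        # M-step: update the mixing weight and the Poisson mean.
        pi = expected_structural / n
        mu = total / (n - expected_structural)
    return pi, mu


# Hypothetical counts with excess zeros (not the study data).
toy_counts = [0, 0, 0, 0, 0, 0, 1, 2, 2, 3, 3, 4]
pi_hat, mu_hat = fit_zip_intercept_only(toy_counts)
print(f"pi = {pi_hat:.3f}, mu = {mu_hat:.3f}")
print(f"ZIP log-likelihood = {zip_log_likelihood(toy_counts, pi_hat, mu_hat):.3f}")
```

In a full regression setting, both π and μ would depend on covariates through link functions (typically logit and log), and the estimates and standard errors would come from a statistical package rather than from a hand-written loop like this one.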

Code 

  • Data, materials, and code are available upon request.

Conclusion 

  • From this model, we observe that nurse practitioners and physician assistants prescribe more opioid drugs than other prescribers. Some previous articles from the National Library of Medicine (USA) report the same (Trends in Opioid Use and Prescribing in Medicare, 2006-2012). 

  • For the gender field, only the male category shows an impact on prescribing opioid drugs. 

  • For states, the model shows that 22 states have an impact on predicting the number of prescriptions. 
