- ~science

9 votes

Posted November 19, 2018 by unknown user

Topic deleted by author

5 comments

[4]
nathan
November 19, 2018
- Exemplary
Unfortunately this is a hard question to answer without more details. What kind of data? What kind of predictions? How much data? How much accuracy? What are your resources? Do you want to learn theory? Do you want to just plug your data into a framework? Is the data cleaned already?

Without much more detail, pretty much all I can recommend is “An Introduction to Statistical Learning”. This book is a good overview of the field (without much emphasis on the current ML craze).

4 votes
1. [3]
  Emerald_Knight
  November 19, 2018
  
  What kind of data?
  
  Inventory management. Items in, items out.
  
  What kind of predictions?
  
  Projecting inventory needs based on rate of depletion.
  
  How much data?
  
  At the moment? Fairly little (currently at 116 data points). This is only 3 weeks of data from a very limited subset of clients, however, and is projected to grow quite quickly.
  
  How much accuracy?
  
  It doesn't need to be super accurate in the short term. A basic curve roughly fit to the data is perfectly acceptable and a margin of error would be accounted for.
  
  What are your resources?
  
  I'm not quite sure what you're asking here. Monetary? Computational? Labor/expertise? Tech stack?
  
  Do you want to learn theory? Do you want to just plug your data into a framework?
  
  I would prefer to take some time to learn the theory. I've never been one to make use of a tool without first understanding what that tool is doing--that's just a headache waiting to happen.
  
  Is the data cleaned already?
  
  If I understand correctly, cleaning data is essentially just a matter of ensuring accuracy, right? If so, then yes, the data is already clean. Adjustments have been made to records as necessary on a rolling basis, and these adjustments have been rare and exceptional in nature.
  
  Overall, I'm fine with a heuristic rather than a perfect model. When a more complex and accurate model is needed, a dedicated data scientist will likely take over developing that model. I'm just laying down some initial infrastructure and getting something workable in place that can be modified as needed. Understanding the fundamentals of the subject is the important step for me to figure out how that infrastructure should look.
  
  4 votes
  1. [2]
    nathan
    November 19, 2018
    
    All great info.
    
    Here’s what I’m thinking off the top of my head.
    
    1. You’ll need a dedicated datastore if you want to support advanced models. You don’t want to run expensive analytics against a production database.
    
    2. I’m assuming you’ll need some environment where other people can use your model. You don’t want your data scientist to have to worry about this, so it’s something you should work out now.
    
    Don’t worry about 1 so much right now. I would just load the data into Excel and create a basic regression for a first model. It will almost certainly be more accurate than you think, and it will give you an actual model to work with while you worry about 2. If you don’t want to bother with Excel, your programming language of choice will almost certainly have some basic regression models in a library somewhere. scikit-learn for Python is a great beginner option.
    
    If you want to learn theory, the book I recommended is a good introduction at the undergraduate level. If you have the math chops, there’s a graduate-level book, “The Elements of Statistical Learning”, by some of the same authors.
    
    3 votes
    
    Emerald_Knight
    November 19, 2018
    
    Thanks for the tips! I'll be sure to take a look at that book :)
    
    1 vote
nfultz
January 22, 2019

ISL & ESL are great recommendations for an intro/deep dive into data science from the stats perspective at the undergrad / graduate level respectively.

Since you specifically mentioned predicting inventory, I just wanted to also mention Rob Hyndman's Forecasting book, which is written at an MBA level and might be more directly relevant to business problems. The corresponding R package is also fantastic.

Facebook's Prophet package for R and Python is also quite nice for getting up and running with forecasting. There is both a quick-start guide and a whitepaper available.

As far as common gotchas go, watch out for oddities in your time dimension. For example, monthly totals for February will skew low (and should!) because the month is up to three days shorter than the others, whereas the same data analyzed as weekly totals would look normal. Know your business cycle.
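
One way to see the February effect, and to remove it, is to divide each monthly total by the number of days in that month. The sketch below uses only the standard library. The totals are made-up numbers generated from a steady 40 units per day, and `daily_rate` is a hypothetical helper, not something from any of the packages mentioned above.

```python
import calendar

# Hypothetical 2018 monthly totals of items shipped, built from a steady
# 40 units/day so that the month-length effect is the only thing varying.
monthly_out = {1: 1240, 2: 1120, 3: 1240, 4: 1200}


def daily_rate(year: int, month: int, total: float) -> float:
    """Convert a monthly total into an average per-day rate."""
    days_in_month = calendar.monthrange(year, month)[1]
    return total / days_in_month


rates = {m: daily_rate(2018, m, total) for m, total in monthly_out.items()}

for month, rate in rates.items():
    print(calendar.month_abbr[month], monthly_out[month], f"{rate:.1f}/day")
```

The raw totals make February look like a slow month, while the per-day rates show that demand never changed.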

2 votes