I use machine learning as a tool for automatically predicting data. Because of that, I end up learning about other people's work.
They mention things like Sparse Coding, Kalman Filter, Convolutional Neural Network, Recurrent Neural Network, etc.
At a very high level, I can understand what these things mean. Sparse Coding is a way to get "better features." Kalman Filter is a way to combine the results of different feature sets / machine learning pipelines on time series data. The neural networks can be trained (or untrained) to classify data.
I want to know how something works on a deeper level when I use it, otherwise I feel dissatisfied and "stupid". This page on Sparse Coding makes absolutely no sense to me in terms of the math: http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding . It's not that the page itself doesn't make sense, it's just that my math is not up to speed.
Kalman Filter has a giant Wiki page: https://en.wikipedia.org/wiki/Kalman_filter. It seems like it would take a very long time to get deep level understanding of these things.
Is there any way I can get a strong foundation in this area that will let me intuitively understand these newer approaches without taking years of courses? I feel like I'm a 3rd grader trying to read and understand Shakespeare. Or is it even worth the time, considering all the machine learning software people develop so that us 3rd graders don't need to know about the math?
Can't keep up with all the math behind machine learning
Re: Can't keep up with all the math behind machine learning
I'm sort of in the same position as you, self-studying ML. I don't have much pedagogical knowledge, but my approach is to pick up a standard textbook (Bishop, Mitchell, or Murphy) and work my way through it. I skip things I find boring if I want, until I have to come back to them to understand something else. I think a systematic approach is necessary because I've never had much success with learning isolated topics in areas of math/CS that I don't already know.
You mentioned your math was not quite up to speed. I had this problem too, so I just ended up following all the derivations in the book carefully until I understood everything. I don't think there's any substitute for just learning the math.
Since machine learning is a wide field, I'm also not attempting to know everything. For me, it's enough to know the basics of the popular algorithms, and then only dive deeper into areas which are more interesting to me (for now, CNNs).
If you just need to use machine learning, I would think one can get 95% of the way there without knowing the details of the algorithms. It's probably sufficient to know the advantages and disadvantages of each algorithm, how to avoid overfitting, and how to trade off bias against variance, and then to call the appropriate library function that someone else wrote.
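To make the overfitting point concrete, here is a minimal sketch of a hold-out check, using only the Python standard library. Everything in it is made up for illustration: the toy 1-D data, the 20% label noise, and the helper names knn_predict and error_rate are assumptions, not anything from a particular textbook or library.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (labels are 0 or 1)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > k else 0


def error_rate(train, data, k):
    """Fraction of points in `data` that a k-NN model built on `train` gets wrong."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in data)
    return wrong / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable


def make_point():
    """Toy data (an assumption): label is 1 when x > 0, flipped 20% of the time."""
    x = rng.uniform(-1.0, 1.0)
    label = 1 if x > 0 else 0
    if rng.random() < 0.2:
        label = 1 - label
    return (x, label)


points = [make_point() for _ in range(200)]
train, held_out = points[:100], points[100:]

for k in (1, 15):
    # Training error flatters the model; held-out error is the honest estimate.
    print(k, error_rate(train, train, k), error_rate(train, held_out, k))
```

With k = 1 every training point is its own nearest neighbour, so the training error is zero even though a fifth of the labels are noise. That is why the number to look at is the error on data the model has not seen, and the same check works whichever library function ends up doing the fitting.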
Also, from my experience, Wikipedia and encyclopedia-type pages are usually a bad place to start learning. They usually serve better as references for refreshing your memory.

Re: Can't keep up with all the math behind machine learning
In my experience, there's still no royal road to understanding mathematics. Especially if understanding is what you want, you have to study the material, whether within the framework of formal university courses or not. Most of the advanced ML material is more or less math-heavy and graduate-level, and that's how the textbooks are written (the necessary theorems and other background might be listed in an appendix, but the assumption often is that the student has already spent a considerable amount of time studying those topics).
However, in my experience the ride is smoother if you have a good command of prerequisites.
For example, the Kalman filter is fundamentally a Bayesian dynamical system model. It's probably easier to "get" what happens if you have first worked through some "easier" material on probability, Bayesian statistics, stochastic processes and linear dynamics, and don't have to teach yourself all of that "at once".
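As a rough illustration of the "Bayesian dynamical system" view, here is a minimal sketch of a scalar Kalman filter in plain Python. It assumes the simplest possible model, a random-walk state observed directly with Gaussian noise. The function name kalman_1d and its parameters are hypothetical, chosen only for this example.

```python
def kalman_1d(measurements, process_var, meas_var, init_mean=0.0, init_var=1.0):
    """Scalar Kalman filter for an assumed random-walk state x_t = x_{t-1} + noise.

    Returns a list of (posterior mean, posterior variance) pairs, one per measurement.
    """
    mean, var = init_mean, init_var
    estimates = []
    for z in measurements:
        # Predict: the random-walk model leaves the mean alone
        # and adds the process noise to the uncertainty.
        var = var + process_var
        # Update: Bayes' rule for a Gaussian prior and a Gaussian measurement.
        gain = var / (var + meas_var)      # how much to trust the measurement
        mean = mean + gain * (z - mean)    # move towards the measurement
        var = (1.0 - gain) * var           # uncertainty shrinks after observing
        estimates.append((mean, var))
    return estimates


# Repeatedly measuring a constant value of 5.0 with no process noise:
for mean, var in kalman_1d([5.0, 5.0, 5.0], process_var=0.0, meas_var=1.0):
    print(round(mean, 3), round(var, 3))
```

Each step is just "prior times likelihood" for Gaussians, which is why the probability and Bayesian statistics prerequisites do most of the work. The full filter replaces the scalars with vectors and matrices, which is where the linear algebra comes in.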
The canonical ML/computational statistics curriculum often covers (most of) the following topics before starting the actual ML stuff, in approximate order starting from "core" and developing to more advanced stuff:
Basic Discrete Mathematics
Single-variable calculus (sometimes called Real Analysis, if proof-heavy)
Linear Algebra
Multivariable/Vector Calculus
Graphs
Basic Probability
Some Frequentist Statistics
More Bayesian Statistics
Optimization algorithms
Stochastic Processes and Markov Chains
Computer Science programmes will have more data structures / algorithms / theory of computation courses that are not strictly necessary for the ML part. Likewise, a mathematician might know about linear operator theory, which is interesting but not that necessary, unlike the probability and vector material that is needed to understand the notation, derivations and proofs.
Re: Can't keep up with all the math behind machine learning
jacques01 wrote: I want to know how something works on a deeper level when I use it, otherwise I feel dissatisfied and "stupid". This page on Sparse Coding makes absolutely no sense to me in terms of the math: http://ufldl.stanford.edu/wiki/index.php/Sparse_Coding . It's not that it doesn't make sense, it's just my math is not up to speed.
It doesn't help that that webpage refers to spanning sets as overcomplete bases, but the first section is basic linear algebra. The probabilistic interpretation is more complicated, but still fairly basic stats, at least to understand if not to generate oneself. The problem is that undergraduate mathematics such as linear algebra is considered absolutely fundamental for a reason: it is.
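For what it's worth, the linear algebra in question fits in a few lines. The sketch below is my own toy example, not taken from the linked page: three vectors that span the plane (one more than a basis needs), one target vector with two exact representations, and a cost of squared reconstruction error plus an L1 penalty, which is one common choice of sparsity cost. The names ATOMS, reconstruct and sparse_cost, and the value of lam, are assumptions for illustration.

```python
# Three vectors spanning the plane: a spanning set, one more than a basis needs.
ATOMS = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def reconstruct(coeffs, atoms=ATOMS):
    """Linear combination sum_i coeffs[i] * atoms[i]."""
    dim = len(atoms[0])
    return tuple(sum(a * phi[d] for a, phi in zip(coeffs, atoms)) for d in range(dim))


def sparse_cost(x, coeffs, lam=0.1, atoms=ATOMS):
    """Squared reconstruction error plus lam times the L1 norm of the coefficients."""
    x_hat = reconstruct(coeffs, atoms)
    error = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    return error + lam * sum(abs(a) for a in coeffs)


x = (2.0, 2.0)
dense = (2.0, 2.0, 0.0)    # uses the two axis vectors
sparse = (0.0, 0.0, 2.0)   # uses only the diagonal vector

# Both coefficient vectors rebuild x exactly, so the representation is not unique.
print(reconstruct(dense), reconstruct(sparse))
# The penalty is what breaks the tie, in favour of the sparser representation.
print(sparse_cost(x, dense), sparse_cost(x, sparse))
```

Because the set has more vectors than dimensions, the coefficients are not determined by x alone, and the sparsity term is what picks one answer out of many. Sparse coding additionally learns the vectors themselves from data, which this sketch does not attempt.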
There are no shortcuts in mathematics. After you have done a lot of it, you can start to get a rough idea of how something works without reading every detail, as I did from the sparse coding article you linked to, but that comes from intuition and experience. I've never read about machine learning before, and I didn't follow every single line of the probabilistic interpretation, but I now have a reasonable working knowledge of what the problem is and how to solve it. That's because I've spent half of my life studying and later working with mathematics on a deep level.
If you point to a house and ask someone how it was built, most people would have no idea, but if you ask a builder, they would have a good idea of what went into it, even if they couldn't immediately reconstruct the whole house themselves on the spot.