What Machine Learning Means For The Oracle DBA
So Much Hype... So Much Reality
Yes, there is a lot of hype about machine learning. But it's real. In this article, I am going to explain why the hype is real.
Plus, I'm going to relate this specifically to Oracle DBAs and more broadly to the Oracle professional.
What Is Machine Learning?
At its core, Machine Learning (ML) is about understanding data: extracting interesting and useful patterns. But this is done methodically, using a wide variety of algorithms.
From a broader perspective, ML fits under the AI umbrella. AI is a very broad and obscenely marketed term.
Data Science has changed over the years. It used to be focused on statistics, such as the mean, median and standard deviation. Then we moved into understanding data relationships using regression analysis. Then visualization came along with charting of all kinds.
But now there is more!
Modern data science uses machine learning because it provides algorithms that are able to automatically analyze large data sets to extract potentially interesting and useful patterns. (more about "useful patterns" later)
Without a wide range of specific algorithms, machine learning would not exist. Here is a short list of some of the more commonly used ML algorithms:
- Support Vector Machines (SVM)
- Decision Tree Learning
- Instance-Based Learning
- Generalized Linear Models
- Artificial Neural Networks
- Clustering (multiple cluster algorithms exist)
How Is ML Used?
If you're new to ML, you may not be aware of where it is used today. The short answer is, EVERYWHERE! It's even kind of creepy.
Machine learning is:
- Used by Alexa and Siri. Ever notice how well they "know you"?
- Minority Report pre-crime is possible. In fact, China is using ML in a pre-crime-like way today. Don't believe me? Do a little research and talk to people. It's real and it's scary.
- Fraud Detection. Have you ever received a phone call or text from a credit card company because they "detected unusual account activity"? That's machine learning at work.
- "What if" analysis. Machine learning is taking scenario type analysis of all types to a new level.
But how does this all relate to you, the Oracle DBA, Developer and Manager? Read on...
How Does This Relate To Me, An Oracle Professional?
Any honest Oracle professional should be asking this question. The best way I can answer it is to ask you a few questions.
Have you ever investigated a performance situation and said:
- What is going on? I've never seen this before! In ML terms, we call this "anomaly detection."
- Uh-oh! We've seen this situation before and it's probably going to turn out bad. In ML terms, we call that "classification."
- Can the production system handle doubling the processes... only in Singapore... only the MFG ABC process... only between 0200 and 0400... In ML terms, that is called "general regression analysis."
In our work, we are constantly faced with "I've never seen that before," "I've seen that before and it's bad," and "What will happen if we do X?" Machine learning can help us answer all three, so we can do our jobs faster and with more precision than ever before.
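To make the "I've never seen this before" case a little more concrete, here is a minimal sketch of anomaly detection. It is only a simple statistical stand-in (a z-score check), not what a full ML package would do, and the metric, the sample values, and the `find_anomalies` helper are all hypothetical, made up for illustration.

```python
from statistics import mean, stdev

def find_anomalies(history, new_values, threshold=3.0):
    """Return the new values that lie more than `threshold` standard
    deviations away from the mean of the historical samples."""
    mu = mean(history)
    sigma = stdev(history)
    return [v for v in new_values if abs(v - mu) / sigma > threshold]

# Hypothetical samples: average wait time (ms) per collection interval.
history = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8, 4.1, 4.0, 4.2, 3.9]

# New intervals to check against what we have "seen before".
new_values = [4.0, 4.4, 9.7]

anomalies = find_anomalies(history, new_values)
print(anomalies)
```

The threshold of 3.0 is an arbitrary choice for this sketch. In real work you would tune it, or use an algorithm that does not assume the data is roughly bell shaped.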
You may be wondering why ML is such a big deal since predictive analysis has been around... forever. I wondered that too! Read on...
Why Is Machine Learning Different?
Personally, I needed to know why machine learning is different. After all, for many years I have taught classes on Oracle predictive analysis. And, I wrote the book, Forecasting Oracle Performance. So, I have real skin in this game.
So, when all the buzz came out about ML, I needed to know what makes ML different from, for example, the regression analysis I've been using and teaching for many years.
I found the answer.
To use Linear Regression (LR) in a real-life production Oracle environment, I had to solve a bunch of very tricky problems. In fact, I developed a methodology to force myself (and others) to face these nasty challenges head on.
To keep this article short, I'll only mention a couple of these "challenges" and how machine learning directly addresses them.
Nearly all Oracle performance related data is non-linear. Usually numeric statistics do not expose this, so you have to look closely at specific charts to visually notice the non-linearity. But you're not done! You still have to deal with this non-linearity.
Machine Learning embraces non-linearity in a number of ways. Nearly all ML algorithms have ways of inherently dealing with this problem. And even traditional regression analysis gets a boost because ML embraces data transformation!
Simply put, data transformation changes existing data so it works better for a specific algorithm and, in many cases, allows the predicted results to be transformed back into their original form.
For example, suppose a model only worked in Celsius. So, if your data is in Fahrenheit, before entering the data into the model, you had better transform it into Celsius. The model would then do its magic, returning a prediction in Celsius. You would then transform the results from Celsius back into Fahrenheit.
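The same transform, model, transform-back round trip can be sketched for non-linear performance data. This is a minimal sketch under stated assumptions: the load and response time numbers are synthetic, the log transform is just one possible choice, and `fit_line`, `predict`, and `predict_raw` are hypothetical helpers written for this example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x. Returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Synthetic (assumed) data: response time (ms) grows non-linearly with load.
loads = [10, 20, 30, 40, 50, 60]
resp = [2.0 * math.exp(0.05 * x) for x in loads]

# Step 1: transform the data so a linear model fits it.
log_resp = [math.log(y) for y in resp]
a, b = fit_line(loads, log_resp)

def predict(load):
    """Step 2: predict in log space. Step 3: transform back to ms."""
    return math.exp(a + b * load)

# For comparison: the same linear model on the untransformed data.
a_raw, b_raw = fit_line(loads, resp)

def predict_raw(load):
    return a_raw + b_raw * load

actual_80 = 2.0 * math.exp(0.05 * 80)
print("actual:", actual_80)
print("with transform:", predict(80))
print("without transform:", predict_raw(80))
```

Because the synthetic data is exactly exponential, the transformed model recovers it almost perfectly, while the untransformed line underestimates badly at a load of 80. Real data is noisier, so expect a smaller (but often still worthwhile) improvement.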
All ML packages include a wide variety of ways to transform data. Awesome!
Not only is this a very cool sounding word, it enables the Data Scientist to increase predictive precision... and not cheat. Let me explain.
One way to cheat on an exam, is to know the questions that will be asked... before you take the exam. The process is simple. You simply memorize the answer to each question.
You would be amazed at how many forecasts are made by simply memorizing questions and answers. You don't need a predictive model in this case. Just use SQL with a simple where clause.
The value of predictive models is their ability to make good predictions based on data the model has never seen before! Just like when you ace a test even though you have never seen the questions before!
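The difference between memorizing and modeling can be sketched in a few lines. Everything here is assumed for illustration: the observations are invented, `memorized` plays the role of the simple where clause lookup, and `modeled` is a plain least squares line standing in for a real predictive model.

```python
# Hypothetical observations: concurrent sessions -> CPU utilization (%).
seen = {10: 12.0, 20: 21.5, 30: 32.0, 40: 41.0, 50: 52.5}

def memorized(sessions):
    """'Cheating': can only answer questions it has already seen.
    Returns None for anything new."""
    return seen.get(sessions)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x. Returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

a, b = fit_line(list(seen.keys()), list(seen.values()))

def modeled(sessions):
    """A model can answer a question it was never shown."""
    return a + b * sessions

print(memorized(30), memorized(35))   # a seen question, then an unseen one
print(modeled(35))                    # the model still produces an estimate
```

The lookup is perfect on questions it has seen and useless on everything else. The fitted line is slightly off on the seen points, but it can still answer for 35 sessions, which is the whole point of a predictive model.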
Part of training a model is exposing it to a mix of data that will lead to the best possible predictive power, without giving the model the final test data.
This is much more difficult than it seems. Machine learning packages can go way beyond simply randomizing and splitting data into training and test data sets. It's super powerful and reduces the likelihood of a bad/biased prediction.
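As a rough illustration, here is a minimal sketch of a simple randomized train/test split, followed by k-fold splitting. K-fold is my example of "going beyond" a single split; the article does not name a specific technique. The data and the `train_test_split` and `k_fold` helpers are hypothetical, written with the standard library only, and are not any particular package's API.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows, then hold back a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def k_fold(rows, k=4, seed=42):
    """Yield k (train, test) pairs so that every row is used for testing
    exactly once and is never in its own training set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test

# Hypothetical rows: (load, response time) pairs.
rows = [(x, 1.5 + 1.0 * x) for x in range(1, 13)]

train, test = train_test_split(rows)
print(len(train), "training rows,", len(test), "test rows")

splits = list(k_fold(rows, k=4))
for fold_train, fold_test in splits:
    print(len(fold_train), "train /", len(fold_test), "test")
```

With a single split, one unlucky shuffle can make a model look better or worse than it is. Rotating the held-back rows, as k-fold does, gives several independent checks on data the model has not seen during training.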
Love It? Care for it? Use it? Embrace It? Explore It!
Yes, there is a lot of hype about machine learning. But it's real. And, it's powerful. And, we as Oracle professionals can use machine learning to do things we have never thought possible.
I am so stoked about empowering Oracle professionals to use machine learning that I re-designed, re-focused and changed the name of my predictive analysis class to Machine Learning - Performance Engineering 2. I'm that serious about the usefulness of machine learning.
I also am ensuring all OraPub members have an opportunity to learn and use machine learning in their work. I believe it is that important and will provide us all with opportunities that we are just beginning to see.
I Want To End With This
Long ago I remember talking to mainframe systems programmers. Many of them thought Unix was kid stuff. Only a short-sighted infantile-minded person would settle for Unix! They said, information technology demands stringent controls and processes, just like an accountant demands debits and credits. It must balance!
What these wonderful folks did not see is that their world was about to be rocked! And instead of being rocked into the future, they were essentially locked into the same job for the rest of their careers. I do not want to live like that and I suspect you don't either.
You may be asking, "Why now?" I mean, what has changed? Why didn't people do this stuff years and years ago? Or even five years ago? Excellent questions! And, that's the topic of my next article.
All the best in your Oracle tuning and machine learning work,
If you have any questions or comments, feel free to email me directly at craig at orapub.com.