About this book

Machine Learning Projects for .NET Developers shows you how to build smarter .NET applications that learn from data, using simple algorithms and techniques that can be applied to a wide range of real-world problems. You’ll code each project in the familiar setting of Visual Studio, while the machine learning logic uses F#, a language ideally suited to machine learning applications in .NET. If you’re new to F#, this book will give you everything you need to get started. If you’re already familiar with F#, this is your chance to put the language into action in an exciting new context.

In a series of fascinating projects, you’ll learn how to:

Build an optical character recognition (OCR) system from scratch
Code a spam filter that learns by example
Use F#’s powerful type providers to interface with external resources (in this case, data analysis tools from the R programming language)
Transform your data into informative features, and use them to make accurate predictions
Find patterns in data when you don’t know what you’re looking for
Predict numerical values using regression models
Implement an intelligent game that learns how to play from experience

Along the way, you’ll learn fundamental ideas that can be applied in all kinds of real-world contexts and industries, from advertising to finance, medicine, and scientific research. While some machine learning algorithms use fairly advanced mathematics, this book focuses on simple but effective approaches. If you enjoy hacking code and data, this book is for you.

Table of Contents

Chapter 1. 256 Shades of Gray

Building a Program to Automatically Recognize Images of Numbers
Abstract
If you were to create a list of current hot topics in technology, machine learning would certainly be somewhere among the top spots. And yet, while the term shows up everywhere, what it means exactly is often shrouded in confusion. Is it the same thing as “big data,” or perhaps “data science”? How is it different from statistics? On the surface, machine learning might appear to be an exotic and intimidating specialty that uses fancy mathematics and algorithms, with little in common with the daily activities of a software engineer.
Mathias Brandewinder

Chapter 2. Spam or Ham?

Automatically Detect Spam in Text Using Bayes’ Theorem
Abstract
If you use email (I suspect you do!), chances are that you see machine learning at work on a daily basis. Your email client probably includes some spam filter mechanism, which automatically identifies blatantly unwanted promotional materials in your incoming messages and discreetly sends them into oblivion in a "spam folder." It spares you the annoyance of having to delete these messages manually one by one; it will also save you from inadvertently clicking potentially harmful links. This is typical of how machine learning, when well done, can make human life better. Computers are great at performing repetitive tasks and being thorough about it; by automatically taking care of tedious activities and saving us from mistakes, they enable us humans to focus on more interesting, thought-engaging activities.
Mathias Brandewinder

Chapter 3. The Joy of Type Providers

Finding and Preparing Data, from Anywhere
Abstract
Let me let you in on the dirty little secret of machine learning: If you were to look solely at the topics publicly discussed, you would think that most of the work revolves around crafting fancy algorithms, the remaining time being spent on designing ways to run these algorithms in a distributed fashion, or some other similarly interesting engineering challenge. The sad truth is that this is a rather marginal part of the job; most of your time will likely be spent on a much more prosaic activity: data janitorial tasks. If you want the machine to learn anything, you need to feed it data, and data has a way of coming from all sorts of different sources, in shapes and formats that are impractical for what you want to do with it. It usually has missing information, and is rarely properly documented. In short, finding and preparing data is both hugely important for machine learning and potentially a source of great pain.
Mathias Brandewinder

Chapter 4. Of Bikes and Men

Fitting a Regression Model to Data with Gradient Descent
Abstract
So far, the problems we have tackled have involved classifying items between a limited set of possible categories. While classification has applications in many practical situations, an arguably more common problem is to predict a number. Consider, for instance, the following task: Given the characteristics of a used car (age, miles, engine size, and so forth), how would you go about predicting how much it is going to sell for? This problem doesn't really fit the pattern of classification. What we need here is a model that differs from classification models in at least two aspects:
Mathias Brandewinder

Chapter 5. You Are Not a Unique Snowflake

Detecting Patterns with Clustering and Principal Components Analysis
Abstract
Sometimes, your day of machine learning begins with a very specific problem to solve. Can you figure out if an incoming email is spam? How accurately can you predict sales? You have been given your marching orders: You start with a question, look for what data is available, and get to work building the best model you can to answer that question.
Mathias Brandewinder

Chapter 6. Trees and Forests

Making Predictions from Incomplete Data
Abstract
One of my nieces' favorite games is a guessing game. One person thinks of something, and the other player tries to figure out what that something is by asking only yes or no questions. If you have played this game before, you have probably seen the following pattern in action: first ask questions that eliminate large categories of possible answers, such as "Is it an animal?", and progressively narrow down the focus of the questions as you gather more information. This is a more effective strategy than asking from the get-go, say, "Did you think about a zebra?" On the one hand, if the answer is "yes," you are done, as there is only one possible answer. On the other hand, if the answer is "no," you are in a rather bad spot, having learned essentially nothing.
Mathias Brandewinder

Chapter 7. A Strange Game

Learning from Experience with Reinforcement Learning
Abstract
Imagine you are a creature in the middle of a large room. The floor is covered with colorful tiles everywhere you look, for as far as you can see. You feel adventurous and take a step forward onto a blue tile. Zing! You feel a burst of pain. Maybe blue tiles are bad? On your left is a red tile, on your right a blue tile. Let’s try red this time. Tada! This time, good things happen. It would seem red tiles are good, and blue are bad. By default, you should probably avoid blue tiles and prefer red ones. Or maybe things are slightly more complicated, and what matters is the particular configurations of tiles. There is only one way to know—trial and error. Try things out, confirm or invalidate hypotheses, and in general, do more of the things that seem to work, and less of the ones that seem to fail.
Mathias Brandewinder

Chapter 8. Digits, Revisited

Abstract
In Chapter 1, we explored the digit recognizer problem together, and wrote a classifier from scratch. In this final chapter, we are going to revisit that problem from two different angles: performance, and useful tools. Chapters 1 to 7 were primarily focused on implementing algorithms to solve various problems, and discovering machine learning concepts in the process. By contrast, this chapter is intended more as a series of practical tips that can be useful in various situations. We will use the digit recognizer model we created in Chapter 1 as a familiar reference point, and use it to illustrate techniques that are broadly applicable to other situations.
Mathias Brandewinder

Chapter 9. Conclusion

Abstract
You made it through this book—congratulations! This was a long journey, and I hope it was also an enjoyable one—one from which you picked up an idea or two along the way. Before we part ways, I figured it might be worthwhile to take a look back at what we have accomplished together, and perhaps also see if there are some broader themes that apply across the chapters, in spite of their profound differences.
Mathias Brandewinder