dlib C++ Library - Frequently Asked Questions

I compiled dlib's Python interface with CUDA enabled, why isn't it using CUDA?

Either you are using a part of dlib that just doesn't use CUDA, of which there are many, or you are mistaken about having compiled dlib with CUDA enabled. In particular, many users report that "dlib isn't using CUDA even though I definitely compiled it with CUDA" and in every case either they are not using a part of dlib that uses CUDA or they have installed multiple copies of dlib on their computer, some with CUDA disabled, and they are using a non-CUDA build.

You can check if dlib is compiled to use CUDA by looking at the dlib.DLIB_USE_CUDA boolean. If dlib.DLIB_USE_CUDA is false then you didn't compile it with CUDA enabled, but if it's true then dlib is using all the CUDA it's going to use.

Why is dlib.image_window missing from the Python module?

If you are getting the error module 'dlib' has no attribute 'image_window' it is because you compiled dlib without GUI support (or you are using a copy of dlib someone else compiled and they built it without GUI support). So note that it is possible to compile dlib without any GUI tools. Some people want to do this because they run dlib on systems that don't have any kind of GUI framework installed.

But since you are reading this you obviously want to use the GUI tools. The solution is to get a copy of dlib and install it yourself by running python setup.py install. It's easy. Note that you might get a warning in the build output about X11 not being installed. Maybe that's why you are getting this error in the first place. READ THAT MESSAGE AND FOLLOW ITS INSTRUCTIONS since it tells you what to do to fix this.

Why is some function missing from the dlib Python module?

If you are missing dlib.image_window then read the FAQ about that. If you are missing any other function then it's because you are using an old version of dlib that just doesn't have that function. You need to install a newer version of dlib. Please don't post questions about this on any of dlib's forums or email me about it. Just install a new dlib. The only way to use features in a new version of dlib is to get the new version of dlib. Often people think they have the new version of dlib installed when really they have some old version installed. You can see what version of dlib you are using by checking dlib.__version__.

How can I cite dlib?

If you use dlib in your research then please use the following citation:

Davis E. King. Dlib-ml: A Machine Learning Toolkit. Journal of Machine Learning Research 10, pp. 1755-1758, 2009


@Article{dlib09,
  author = {Davis E. King},
  title = {Dlib-ml: A Machine Learning Toolkit},
  journal = {Journal of Machine Learning Research},
  year = {2009},
  volume = {10},
  pages = {1755-1758},
}

How can I use dlib in Visual Studio?

First, note that you need a version of Visual Studio with decent C++11 support. This means you need Visual Studio 2015 or newer.

There are instructions on the How to Compile page. If you do not understand the instructions in the "Compiling on Windows Using Visual Studio" section or are getting errors then follow the instructions in the "Compiling on Any Operating System Using CMake" section. In particular, install CMake and then type these exact commands from within the root of the dlib distribution:

cd examples
mkdir build
cd build
del /F /S /Q *
cmake ..
cmake --build . --config Release

That should compile the dlib examples in Visual Studio. The output executables will appear in the Release folder. The del /F /S /Q * command is there to make sure you clear out any extraneous files you might have placed in the build folder and is not necessary if the build folder starts out empty.

How do I install/compile dlib?

Follow the official instructions. They tell you exactly what to type to use dlib.

How do I set the size of a matrix at runtime?

Long answer, read the matrix example program.

Short answer, here are some examples:

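matrix<double> mat;                  // rows and columns are both sized at runtime
mat.set_size(4,5);                   // 4 rows, 5 columns

matrix<double,0,1> column_vect;      // column vector: number of rows set at runtime
column_vect.set_size(6);

matrix<double,0,1> column_vect2(6);  // give size to constructor

matrix<double,1,0> row_vect;         // row vector: number of columns set at runtime
row_vect.set_size(5);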

How does dlib interface with other libraries/tools?

There should never be anything in dlib that prevents you from using or interacting with other libraries. Moreover, there are some additional tools in dlib to make some interactions easier:

BLAS and LAPACK libraries are used by the matrix automatically if you #define DLIB_USE_BLAS and/or DLIB_USE_LAPACK and link against the appropriate library files. Note that the CMakeLists.txt file that comes with dlib will do this for you automatically in many instances.

Armadillo and Eigen libraries have matrix objects which can be converted into dlib matrix objects by calling dlib::mat() on them.

OpenCV image objects can be converted into a form usable by dlib routines by using cv_image. You can also convert from a dlib matrix or image to an OpenCV Mat using dlib::toMat().

Google Protocol Buffers can be serialized by the dlib serialization routines. This means that, for example, you can pass protocol buffer objects through a bridge.

libpng and libjpeg are used by load_image whenever DLIB_PNG_SUPPORT and DLIB_JPEG_SUPPORT are defined respectively. You must also tell your compiler to link against these libraries to use them. However, CMake will try to link against them automatically if they are installed.

SQLite is used by the database object. In fact, it is just a wrapper around SQLite's C interface which simplifies its use (e.g. makes resource management use RAII).

It doesn't work?

Do not post a question like "I'm using dlib, and it doesn't work?" or "I'm using the object detector and it doesn't work, what do I do?". If this is all you say then I have no idea what is wrong. 99% of the time it's some kind of user error. 1% of the time it's some problem in dlib. But again, without more information it's impossible to know. So please don't post questions like this.

If you think you found some kind of bug or problem in dlib then feel free to submit a dlib issue on GitHub. But include the version of dlib you are using, what you are trying, what happened, what you expected to have happened instead, etc.

On the other hand, if you haven't found a bug or problem in dlib, but instead are looking for machine learning/computer vision/programming help then post your question to Stack Overflow with the dlib tag.

Where is the documentation for <object/function>?

Every class and function in dlib is documented in detail. If you can't find something then check the index.

Also, the bulk of the documentation can be found by clicking the More Details... buttons. So you should click on the "more details" buttons and read the documentation.

A lot of people post questions like "There is no documentation for some_random_function(), how do I use it?", when in reality the function is documented in detail. Between the index, site search, and main website which breaks down functions/classes into topical categories there is no excuse for not being able to find the documentation for a function or class. This is especially true if you know its name because you can jump right to it using the index or even a simple google search. So if you are posting a question like "I don't understand how something works" and obviously haven't read the documentation then you are just going to get referred to this FAQ. So please read the documentation before asking questions.

Why do I get USER_ERROR__inconsistent_build_configuration__see_dlib_faq_1?

You are getting this error because you either forgot to link to dlib, or are not compiling all the C++ code in your program with consistent settings. The latter is wrong because it is a violation of C++'s One Definition Rule. In this case, you are compiling some translation units with dlib's assert macros enabled and others with them disabled.

For reference, the code that generates this error is: dlib/test_for_odr_violations.h and dlib/test_for_odr_violations.cpp.

Why do I get USER_ERROR__inconsistent_build_configuration__see_dlib_faq_2?

You are getting this error because you are not compiling all the C++ code in your program with consistent settings. This is a violation of C++'s One Definition Rule. In this case, you compiled a standalone copy of dlib with CMake and instead of using make install or cmake --build . --target install to copy the resulting build files somewhere you went and cherry picked files manually and messed it up. In particular, CMake compiled dlib with a bunch of settings recorded in the CMake generated config.h file but you instead are now trying to build more dlib related code with the dlib/config.h from source control.

For reference, the code that generates this error is: dlib/test_for_odr_violations.h and dlib/test_for_odr_violations.cpp.

Finally, most users who get this error are using Visual Studio. You probably compiled dlib and then went into Visual Studio's output folder, grabbed the .lib file, and then tried to create a project using that .lib file and dlib's .h files from github. THIS IS WRONG, DO NOT DO THIS. Instead, read the instructions for using dlib and follow them. I promise they are much simpler than any process that involves manually copying files around in the file explorer.

Why is dlib slow?

Dlib isn't slow. I get this question many times a week and 95% of the time it's from someone using Visual Studio who has compiled their program in Debug mode rather than the optimized Release mode. So if you are using Visual Studio then realize that Visual Studio has these two modes. The default is Debug. The mode is selectable via a drop down:

Debug mode disables compiler optimizations. So the program will be very slow if you run it in Debug mode. So click the drop down and select Release.

Then when you compile the program it will appear in a folder named Release rather than in a folder named Debug.

Finally, you can enable either SSE4 or AVX instruction use. These will make certain operations much faster (e.g. face detection). You do this using CMake's cmake-gui tool. For example, if you execute these commands you will get the cmake-gui screen:

cd examples
mkdir build
cd build
cmake .. 
cmake-gui .

This opens the cmake-gui window, where you can select SSE4 or AVX instruction use. Then you click configure and then generate. After that when you build your Visual Studio project some things will be faster. Finally, note that AVX is a little bit faster than SSE4 but if your computer is fairly old it might not support it. In that case, either buy a new computer or use SSE4 instructions.

Why isn't serialization working?

Here are the possibilities:

You are using a file stream and forgot to put it into binary mode. You need to do something like this:
```
std::ifstream fin("myfile", std::ios::binary);
```
or
```
std::ofstream fout("myfile", std::ios::binary);
```
If you don't give std::ios::binary then the iostream will mess with the binary data and cause serialization to not work right.

The iostream is in a bad state. You can check the state by calling mystream.good(). If it returns false then the stream is in an error state such as end-of-file or maybe it failed to do the I/O. Also note that if you close a file stream and reopen it you might have to call mystream.clear() to clear out the error flags.
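Both possibilities can be seen in a small standalone sketch. To keep it self-contained it uses plain read() and write() calls on the streams rather than dlib's serialization routines, and the helper names write_value, read_value, and read_twice are made up for illustration. The point is only the stream handling: open in binary mode, check good(), and call clear() after reopening a stream that previously hit an error.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: write one 32-bit value as raw bytes.
// Returns false if the stream ended up in an error state.
bool write_value(const std::string& filename, std::uint32_t value)
{
    assert(!filename.empty());
    std::ofstream fout(filename, std::ios::binary);  // binary mode: bytes are written untouched
    if (!fout.good())
        return false;
    fout.write(reinterpret_cast<const char*>(&value), sizeof(value));
    fout.flush();
    return fout.good();
}

// Hypothetical helper: read the value back, again in binary mode.
bool read_value(const std::string& filename, std::uint32_t& value)
{
    assert(!filename.empty());
    std::ifstream fin(filename, std::ios::binary);
    if (!fin.good())
        return false;
    fin.read(reinterpret_cast<char*>(&value), sizeof(value));
    return fin.good();
}

// Reuse one stream object for two reads. The first pass deliberately reads
// past the end of the file, which puts the stream into an error state.
bool read_twice(const std::string& filename, std::uint32_t& first, std::uint32_t& second)
{
    std::ifstream fin(filename, std::ios::binary);
    fin.read(reinterpret_cast<char*>(&first), sizeof(first));
    char extra;
    fin.read(&extra, 1);               // nothing left to read: good() is now false
    const bool hit_error = !fin.good();
    fin.close();
    fin.open(filename, std::ios::binary);
    fin.clear();                       // harmless if open() already reset the flags
    fin.read(reinterpret_cast<char*>(&second), sizeof(second));
    return hit_error && fin.good();
}
```

A value such as 0x0D0A1A0A is a handy test input because it contains carriage return, line feed, and end-of-file control bytes, which are the kind of bytes a text mode stream may alter on some platforms.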

Can you give advice on feature generation/kernel selection?

Picking the right kernel all comes down to understanding your data, and obviously this is highly dependent on your problem.

One thing that's sometimes useful is to plot each feature against the target value. You can get an idea of what your overall feature space looks like and maybe tell if a linear kernel is the right solution. But this still hides important information from you. For example, imagine you have two diagonal lines which are very close together and are both the same length. Suppose one line is of the +1 class and the other is the -1 class. Each feature (the x or y coordinate values) by itself tells you almost nothing about which class a point belongs to but together they tell you everything you need to know.
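The two diagonal lines case can be checked numerically. The following is a minimal sketch, not dlib code: the struct labeled_point and the functions make_two_diagonal_lines and threshold_accuracy are hypothetical names, and the line length and offset are arbitrary choices. It builds two parallel segments and measures how well a simple threshold rule does on a chosen feature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A 2D point with a class label of +1 or -1.
struct labeled_point
{
    double x;
    double y;
    int label;
};

// Two parallel diagonal segments of equal length that sit close together.
// Points on y = x + offset are labeled +1, points on y = x - offset are labeled -1.
std::vector<labeled_point> make_two_diagonal_lines(std::size_t points_per_line, double offset)
{
    assert(points_per_line > 1 && offset > 0);
    std::vector<labeled_point> pts;
    for (std::size_t i = 0; i < points_per_line; ++i)
    {
        const double t = 10.0 * i / (points_per_line - 1);  // t runs from 0 to 10
        pts.push_back({t, t + offset, +1});
        pts.push_back({t, t - offset, -1});
    }
    return pts;
}

// Fraction of points classified correctly by the rule
// "feature(p) > threshold means class +1, otherwise class -1".
template <typename Feature>
double threshold_accuracy(const std::vector<labeled_point>& pts, Feature feature, double threshold)
{
    assert(!pts.empty());
    std::size_t correct = 0;
    for (const auto& p : pts)
    {
        const int predicted = feature(p) > threshold ? +1 : -1;
        if (predicted == p.label)
            ++correct;
    }
    return static_cast<double>(correct) / pts.size();
}
```

With this construction every x value appears once in each class, so thresholding x alone gets exactly half the points right no matter where the threshold is. Thresholding the combined feature y - x at 0 gets all of them right, which is why a plot of each feature against the target would have hidden that a linear solution exists.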

On the other hand, if you know something about the data you are working with then you can also try and generate your own features. So for example, if your data is a bunch of images and you know that one of your classes contains a lot of lines then you can make a feature that attempts to measure the number of lines in an image using a Hough transform or Sobel edge filter or whatever. Generally, try and think up features which should be highly correlated with your target value. A good way to do this is to try and actually hand code N solutions to the problem using whatever you know about your data or domain. If you do a good job then you will have N really great features and a linear or rbf kernel will probably do very well when using them.

Or you can just try a whole bunch of kernels, kernel parameters, and training algorithm options while using cross validation. I.e. when in doubt, use brute force :) There is an example of that kind of thing in the model selection example program.

How can I define a custom kernel?

See the Using Custom Kernels example program.

Why does my decision_function always give the same output?

This happens when you use the radial_basis_kernel and you set the gamma value to something highly inappropriate. To understand what's happening let's imagine your data has just one feature and its value ranges from 0 to 7. Then what you want is a gamma value that gives nice Gaussian bumps whose width is on the scale of your data.

However, if you make gamma really huge the kernel will be zero everywhere except for one place.

Or if you make gamma really small then it will be 1.0 everywhere.

So you need to pick the gamma value so that it is scaled reasonably to your data. A good rule of thumb (i.e. not the optimal gamma, just a heuristic guess) is the following:

const double gamma = 1.0/compute_mean_squared_distance(randomly_subsample(samples, 2000));
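To see the three regimes with actual numbers, here is a small standalone sketch that does not use dlib. It assumes the radial basis kernel has the form exp(-gamma * squared distance) and that the mean squared distance is the average over all distinct pairs of samples. The functions rbf and mean_squared_distance are illustrative stand-ins, not dlib's own routines.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Radial basis kernel on scalar inputs, assumed form: exp(-gamma * (a - b)^2).
double rbf(double a, double b, double gamma)
{
    const double d = a - b;
    return std::exp(-gamma * d * d);
}

// Mean squared distance over all distinct pairs of samples.
// This is a stand-in for the heuristic above, not dlib's implementation.
double mean_squared_distance(const std::vector<double>& samples)
{
    assert(samples.size() > 1);
    double sum = 0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < samples.size(); ++i)
    {
        for (std::size_t j = i + 1; j < samples.size(); ++j)
        {
            const double d = samples[i] - samples[j];
            sum += d * d;
            ++count;
        }
    }
    return sum / count;
}
```

For the samples 0, 1, ..., 7 the mean squared distance works out to 12, so the heuristic gives gamma = 1/12. With that gamma, neighboring samples have a kernel value a little above 0.9 while the two extremes have a value below 0.05, so the kernel actually distinguishes near from far. With gamma = 1e6 every pair of distinct samples gives essentially 0, and with gamma = 1e-9 every pair gives essentially 1, which is why the decision function ends up constant in those cases.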

Why is cross_validate_trainer_threaded() crashing?

This function makes a copy of your training data for each thread. So you are probably running out of memory. To avoid this, use the randomly_subsample function to reduce the amount of data you are using or use fewer threads.

For example, you could reduce the amount of data by saying this:

// reduce to only 1000 samples
cross_validate_trainer_threaded(trainer, 
                                randomly_subsample(samples, 1000), 
                                randomly_subsample(labels,  1000), 
                                4,   // num folds
                                4);  // num threads

Why is RVM training really slow?

The optimization algorithm is somewhat unpredictable. Sometimes it is fast and sometimes it is slow. What usually makes it really slow is if you use a radial basis kernel and you set the gamma parameter to something too large. This causes the algorithm to start using a whole lot of relevance vectors (i.e. basis vectors) which then makes it slow. The algorithm is only fast as long as the number of relevance vectors remains small but it is hard to know beforehand if that will be the case.

You should try kernel ridge regression instead since it also doesn't take any parameters but is always very fast.

Why doesn't the object detector I trained work?

There are three general mistakes people make when trying to train an object detector with dlib.

Not labeling all the objects in each image

The tools for training object detectors in dlib use the Max-Margin Object Detection loss. This loss optimizes the performance of the detector on the whole image, not on some subset of windows cropped from the training data. That means it counts the number of missed detections and false alarms for each of the training images and tries to find a way to minimize the sum of these two error metrics. For this to be possible, you must label all the objects in each training image. If you leave unannotated objects in some of your training images then the loss will think any detections on these unannotated objects are false alarms, and will therefore try to find a detector that doesn't detect them. If you have enough unannotated objects, the most accurate detector will be the one that never detects anything. That's obviously not what you want. So make sure you annotate all the objects in each image.

Sometimes annotating all the objects in each image is too onerous, or there are ambiguous objects you don't care about. In these cases you should annotate these objects you don't care about with ignore boxes so that the MMOD loss knows to ignore them. You can do this with dlib's imglab tool by selecting a box and pressing i.

Moreover, there are two ways the code treats ignore boxes. When a detector generates a detection it compares it against any ignore boxes and ignores it if the boxes "overlap". Deciding if they overlap is based on either their intersection over union or just basic percent coverage of one by another. You have to think about what mode you want when you annotate things and configure the training code appropriately. The default behavior is to use intersection over union to measure overlap. However, if you wanted to simply mask out large parts of an image you wouldn't want to use intersection over union to measure overlap since small boxes contained entirely within the large ignored region would have small IoU with the big ignore region and thus not "overlap" the ignore region. In this case you should change the settings to reflect this before training. The available configuration options are discussed in great detail in parts of dlib's documentation.

Using training images that don't look like the testing images

This should be obvious, but needs to be pointed out. If there is some clear difference between your training and testing images then you have messed up. You need to show the training algorithm real images so it can learn what to do. If instead you only show it images that look obviously different from your testing images don't be surprised if, when you run the detector on the testing images, it doesn't work. As a rule of thumb, a human should not be able to tell if an image came from the training dataset or testing dataset.

Here are some examples of bad datasets:
- A training dataset where objects always appear with some specific orientation but the testing images have a diverse set of orientations.
- A training dataset where objects are tightly cropped, but testing images that are uncropped.
- A training dataset where objects appear only on a perfectly white background with nothing else present, but testing images where objects appear in a normal environment like living rooms or in natural scenes.

Another way you can mess this up is when using the random_cropper to jitter your training data, which is common when training a CNN or other deep model. In general, the random_cropper finds images that are more or less centered on your objects of interest and it also scales the images so the object has some user specified minimum size. That's all fine. But what can happen is you train a model that gets 0 training error but when you go and use it it doesn't detect any objects. Why is that? It's probably because all the objects in your normal images, the ones you give to the random_cropper, are really small. Smaller than the min size you told the cropper to make them. So now your testing images are really different from your training images. Moreover, in general object detectors have some minimum size they scan and if objects are smaller than that they will never be found.

Another related issue is all your uncropped images might show objects at the very border of the image. But the random_cropper will center the objects in the crops, by padding with zeros if necessary. Again, make your testing images look like the training images. Pad the edges of your images with zeros if needed.

Using a HOG based detector but not understanding the limits of HOG templates

The HOG detector is very fast and generally easy to train. However, you have to be aware that HOG detectors are essentially rigid templates that are scanned over an image. So a single HOG detector isn't going to be able to detect objects that appear in a wide range of orientations or undergo complex deformations or have complex articulation.

For example, a HOG detector isn't going to be able to learn to detect human faces that are upright as well as faces rotated 90 degrees. If you wanted to deal with that you would be best off training 2 detectors. One for upright faces and another for 90 degree rotated faces. You can efficiently run multiple HOG detectors at once using the evaluate_detectors function, so it's not a huge deal to do this. Dlib's imglab tool also has a --cluster option that will help you split a training dataset into clusters that can be detected by a single HOG detector. You will still need to manually review and clean the dataset after applying --cluster, but it makes the process of splitting a dataset into coherent poses, from the point of view of HOG, a lot easier.

A related issue arises because HOG is a rigid template, which is that the boxes in your training data need to all have essentially the same aspect ratio. For instance, a single HOG filter can't possibly detect objects that are both 100x50 pixels and 50x100 pixels. To do this you would need to split your dataset into two parts, objects with a 2:1 aspect ratio and objects with a 1:2 aspect ratio and then train two separate HOG detectors, one for each aspect ratio.

However, it should be emphasized that even using multiple HOG detectors will only get you so far. So at some point you should consider using a CNN based detection method since CNNs can generally deal with arbitrary rotations, poses, and deformations with one unified detector.
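The difference between the two ignore box overlap modes described under the first mistake is easy to see with a little arithmetic. This is a plain geometric sketch under my own assumptions, not dlib's rectangle class or its overlap testing code: the box struct uses floating point edges, and the function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical axis-aligned box covering [left, right) x [top, bottom).
struct box
{
    double left;
    double top;
    double right;
    double bottom;
};

// Area of a box, treating inverted boxes as empty.
double area(const box& b)
{
    return std::max(0.0, b.right - b.left) * std::max(0.0, b.bottom - b.top);
}

// Area shared by two boxes.
double intersection_area(const box& a, const box& b)
{
    const box overlap = {std::max(a.left, b.left), std::max(a.top, b.top),
                         std::min(a.right, b.right), std::min(a.bottom, b.bottom)};
    return area(overlap);
}

// Intersection over union of two boxes.
double intersection_over_union(const box& a, const box& b)
{
    const double inter = intersection_area(a, b);
    const double uni = area(a) + area(b) - inter;
    return uni > 0 ? inter / uni : 0.0;
}

// Fraction of the detection's area that lies inside the ignore box.
double percent_covered(const box& detection, const box& ignore)
{
    const double a = area(detection);
    assert(a >= 0);
    return a > 0 ? intersection_area(detection, ignore) / a : 0.0;
}
```

Take a 1000x1000 ignore region and a 50x50 detection sitting entirely inside it. The IoU is 2500/1000000 = 0.0025, so under an IoU test the detection would not count as overlapping the ignore region. The percent coverage is 1.0, so under a coverage test it would. That is the case where you want to switch away from the default.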

Why can't I change the network architecture at runtime?

A major design goal of this API is to let users create new loss layers, computational layers, and solvers without needing to understand or even look at the dlib internals. A lot of the API decisions are based on what makes the interface a user needs to implement to create new layers as simple as possible. In particular, designing the API in this compile-time static way makes it simple for these use cases.

Here is an example of one problem it addresses. Since dlib exposes the entire network architecture to the C++ type system we can get automatic serialization of networks. Without this, we would have to resort to the kind of hacky global layer registry used in other tools that compose networks entirely at runtime.

Another nice feature is that we get to use C++11 alias template statements to create network sub-blocks, which we can then use to easily define very large networks. There are examples of this in this example program. It should also be pointed out that it takes days or even weeks to train one network. So it isn't as if you will be writing a program that loops over large numbers of networks and trains them all. This makes the time needed to recompile a program to change the network irrelevant compared to the entire training time. Moreover, there are plenty of compile time constructs in C++ you can use to enumerate network architectures (e.g. loop over filter widths) if you really wanted to do so.
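To give a feel for what alias templates buy you, here is a toy sketch. The layer types below (input_layer, conv_layer, relu_layer) are hypothetical stand-ins I made up so the example compiles on its own. They are not dlib's layer types and do no computation. The only thing they do is count layers at compile time, which is enough to show how a sub-block defined once can be stacked into a bigger network type.

```cpp
#include <cstddef>

// Hypothetical stand-ins for layers. None of these are dlib types.
struct input_layer
{
    static constexpr std::size_t num_layers = 1;
};

template <std::size_t num_filters, typename SUBNET>
struct conv_layer
{
    static constexpr std::size_t num_layers = SUBNET::num_layers + 1;
};

template <typename SUBNET>
struct relu_layer
{
    static constexpr std::size_t num_layers = SUBNET::num_layers + 1;
};

// A reusable sub-block defined with a C++11 alias template: conv followed by relu.
template <std::size_t num_filters, typename SUBNET>
using block = relu_layer<conv_layer<num_filters, SUBNET>>;

// Stack the sub-block to define a larger network as a single type.
using small_net = block<32, block<16, block<8, input_layer>>>;

// The whole architecture is visible to the compiler, so it can be checked statically.
static_assert(small_net::num_layers == 7, "three conv+relu blocks on top of the input");
```

Because the entire architecture is one type, anything that can be derived from the type, like the layer count here, is available without any runtime registry.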

All that said, if you think you found a compelling use case that isn't supported by the current API feel free to post a GitHub issue.

Why can't I use the DNN module with Visual Studio?

You can, but you need to use Visual Studio 2015 Update 3 or newer since prior versions had bad C++11 support. To make this as confusing as possible, Microsoft has released multiple different versions of "Visual Studio 2015 Update 3". As of October 2016, the version available from the Microsoft web page has good enough C++11 support to compile the DNN tools in dlib. So make sure you have a version no older than October 2016.

To make this even more complicated, Visual Studio 2017 had regressions in its C++11 support. So all versions of Visual Studio 2017 prior to December 2017 would just hang if you tried to compile the DNN examples. Happily, the newest versions of Visual Studio 2017 appear to have good C++11 support and will compile the DNN codes without any issue. So make sure your Visual Studio is fully updated.

Finally, it should be noted that you should give the -T host=x64 cmake option when generating a Visual Studio project. If you don't do this then you will get the default Visual Studio toolchain, which runs the compiler in 32bit mode, restricting it to 2GB of RAM, leading to compiler crashes due to it running out of RAM in some cases. This isn't the 1990s anymore, so you should probably run your compiler in 64bit mode so it can use your computer's RAM. Giving -T host=x64 will let Visual Studio use as much RAM as it needs.
