Tuesday, 25 October 2016

Dealing with compiler problems when installing XGBoost on macOS 10.12


XGBoost is a sexy machine learning library that has been performing very well in recent Kaggle competitions. This post doesn't intend to describe the machinery of XGBoost, but rather to relate the issues I faced while installing the XGBoost Python package.

The holy pip command


I very much like working on a Unix-like system because installing Python libs is made easy through the pip command.
According to the XGBoost main page:

pip install xgboost

should do the job... but it didn't, at least in my environment. I got the following error:

Obtaining file:///Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package/setup.py", line 19, in <module>
        LIB_PATH = libpath['find_lib_path']()
      File "/Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package/xgboost/libpath.py", line 46, in find_lib_path
        'List of candidates:\n' + ('\n'.join(dll_path)))
    __builtin__.XGBoostLibraryNotFound: Cannot find XGBoost Library in the candidate path, did you install compilers and run build.sh in root path?
    List of candidates:
    /Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package/xgboost/libxgboost.so
    /Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package/xgboost/../../lib/libxgboost.so
    /Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package/xgboost/./lib/libxgboost.so
    /Users/greghor/anaconda2/xgboost/libxgboost.so
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package/
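
The error boils down to this: none of the candidate paths contains a compiled libxgboost.so. Here is a minimal Python sketch to check that by hand. The paths are copied from the traceback above, and the helper name existing_libraries is made up for illustration (it is not part of XGBoost):

```python
import os


def existing_libraries(candidates):
    """Return the candidate paths that exist on disk as regular files."""
    return [os.path.normpath(p) for p in candidates if os.path.isfile(p)]


# Candidate paths taken from the error message; replace them with your own.
base = "/Users/greghor/anaconda2/lib/python2.7/site-packages/xgboost/python-package/xgboost"
candidates = [
    base + "/libxgboost.so",
    base + "/../../lib/libxgboost.so",
    base + "/./lib/libxgboost.so",
    "/Users/greghor/anaconda2/xgboost/libxgboost.so",
]

found = existing_libraries(candidates)
if found:
    print("compiled library found:", found)
else:
    print("no libxgboost.so in any candidate path, so the build step did not succeed")
```

If the list comes back empty, the problem is the compilation itself and not pip or setuptools.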



I started to look into this egg_info error and updated the corresponding components, but it didn't fix my issue.

Manual installation to account for multithreading

The main difficulty when you don't have a background in CS is that all the logs look so cryptic. I must honestly say that I often debug by trial and error without really understanding the underlying logic... this was my approach here, and I ended up following a tutorial explaining how to install xgboost manually.
This approach has the benefit of enabling multithreading (which was seemingly not the case with pip install). Following the steps described here seems to work for 99% of people, unfortunately not for me.
I was still stuck with the same error. Google told me that the root of the evil was probably a problem with the C++ compiler. There are different compilers available out there; I personally use gcc (for C) and g++ (for C++). XGBoost needs to be compiled with recent versions of gcc and g++ (it didn't work with gcc-4.9 and g++-4.9). While my gcc and g++ were up to date, the gcc and g++ commands still pointed to the old versions by default... and this was the key point. It took me a while to figure this out, but you also need to update the symbolic links pointing to the compiler:

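cd /usr/local/bin   # folder holding gcc-6 and g++-6 (assumed Homebrew default; adjust if yours differs)
rm -f gcc g++       # remove the old links, if any
ln -s gcc-6 gcc     # set default gcc to gcc-6
ln -s g++-6 g++     # set default g++ to g++-6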
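
To see which binaries the gcc and g++ commands really resolve to (before or after touching the links), something like the following Python sketch can help. It only uses the standard library; the function name describe_compiler is my own, and what it prints depends entirely on your machine. Keep in mind that on a Mac the plain gcc command is often Apple's clang in disguise, which the version line will reveal.

```python
import os
import shutil
import subprocess


def describe_compiler(name):
    """Return (path on PATH, resolved real path, first version line), or None if not found."""
    path = shutil.which(name)
    if path is None:
        return None
    real_path = os.path.realpath(path)  # follows symbolic links such as gcc -> gcc-6
    try:
        output = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            timeout=10,
        ).stdout
    except (OSError, subprocess.SubprocessError):
        output = ""
    lines = output.splitlines()
    return path, real_path, (lines[0] if lines else "")


for name in ("gcc", "g++"):
    print(name, "->", describe_compiler(name))
```

If the resolved path or the version line still shows the old compiler, the links have not been updated, or another folder comes first in your PATH.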

This last step finally fixed my issue. XGBoost is now working like a charm! Kaggle folks, watch your ass! Here I come!
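
A quick sanity check that the package really loads is to import it and print its version. This is a small sketch, not an official test: the helper installed_version is hypothetical, and it catches any exception because a missing compiled library can make the import fail with something other than ImportError.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, "unknown" if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Covers ImportError as well as errors raised while loading a native library.
        return None
    return getattr(module, "__version__", "unknown")


print("xgboost:", installed_version("xgboost"))
```

A result of None means the import still fails and the build needs another look.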


1 comment:

  1. Modern XGBoost will install directly on macOS using pip and will use all CPU cores, for example: pip install xgboost

    It's also a good idea to use a virtual environment when installing libraries via pip, for example: https://xgboosting.com/install-xgboost-for-python-on-macos/
