To get started with NLTK, you will need to get the library up and running on your machine. In this tutorial, you will learn
- How to install NLTK in Windows
- How to install Python in Windows
- How to install NLTK in Mac or Linux
- How to install NLTK from Anaconda
- How to download NLTK dataset
- How to download all NLTK packages
- How to run NLP script
- How to run NLTK script
How to Install NLTK in Windows
You will first need to have Python on your machine. So we will start by installing Python. If you have Python already installed on your machine, you can skip this step.
How to Install Python in Windows
- Go to the official Python website and navigate to the Downloads menu. You can do this by clicking this link: https://www.python.org/downloads/
- Select the Python version you wish to download. You may want the latest version so that you get the most up-to-date packages
Other versions are available from the list of Python releases, organized by version number
- Once you click on the download button, the download commences
- Once the download is complete, open the downloaded file to begin the installation process. Click on install now
- Click on Next till the installation completes.
- Navigate to the Scripts folder inside your Python installation directory and copy its file path
- Open Command Prompt, type cd followed by the pasted path to switch to that folder, then run this command and wait for the installation to complete
pip3 install nltk
- You can confirm it successfully installed by opening the Python shell and running this code:
import nltk
If it runs without error, NLTK has been installed successfully on your PC.
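If you are unsure where the Scripts folder is, or whether the install went to the interpreter you expect, the short check below may help. It is a minimal sketch using only the standard library; the helper names scripts_folder and nltk_status are made up for this illustration.

```python
import importlib.util
import sysconfig


def scripts_folder():
    """Return the folder where this interpreter keeps pip and other scripts."""
    # On Windows this is typically the "Scripts" folder; on Linux/Mac it is "bin".
    return sysconfig.get_path("scripts")


def nltk_status():
    """Report whether NLTK can be imported by this interpreter."""
    if importlib.util.find_spec("nltk") is None:
        return "NLTK is not installed for this interpreter"
    import nltk  # imported only after confirming the package is present
    return f"NLTK {nltk.__version__} is installed"


print("Scripts folder:", scripts_folder())
print(nltk_status())
```

If the second line reports that NLTK is missing even though pip finished without errors, pip most likely installed it for a different Python installation.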
How to Install NLTK for Mac/Linux
Unlike Windows, most Linux and Mac systems come with Python preinstalled, although older machines may only have Python 2. NLTK is installed on Linux/Mac with the same pip installer. Open your terminal and type the commands below. Note that the apt commands apply to Debian-based Linux distributions such as Ubuntu.
- First, it is good practice to update the package index. To do this, type the command
sudo apt update
- To install another version of Python in a Linux system, type
sudo apt-get install python3.7
- To install pip for Python 3, type
sudo apt install python3-pip
- Then upgrade pip with either of these equivalent commands
pip3 install -U pip
pip3 install --upgrade pip
- Once pip is installed, you can install NLTK using one of the following commands (use pip3 if pip points to Python 2 on your machine)
sudo pip install -U nltk
sudo pip3 install -U nltk
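When a machine has several Python versions, it is easy for pip and pip3 to point at different interpreters. One common way around this is to run pip as a module of the exact interpreter you plan to use. The sketch below builds that command from Python; pip_command and install are hypothetical helper names for this example, and install is defined but not called so nothing is installed until you choose to run it.

```python
import subprocess
import sys


def pip_command(package, upgrade=True):
    """Build a pip install command tied to the interpreter running this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("-U")  # same meaning as --upgrade
    command.append(package)
    return command


def install(package):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.run(pip_command(package), check=True)


# Show the command that would be run, without running it.
print(" ".join(pip_command("nltk")))
```

Calling install("nltk") runs the printed command. Depending on how Python was installed, you may still need administrator rights or a virtual environment for it to succeed.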
How to Install NLTK from Anaconda
- First, you will need to have Anaconda installed on your system. Visit the Anaconda website and open the download menu, or go directly to this link and click Download: https://www.anaconda.com/products/individual#download-section
- Once the file has downloaded, open the .exe file to install it.
- Once the installation is complete, open Anaconda Prompt and enter this command
conda install -c anaconda nltk
- Accept the package updates by typing ‘y’ when prompted, then wait for the package to download.
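If you have both a regular Python install and Anaconda on the same machine, you may want to confirm which one your Python shell is using. The check below is only a heuristic sketch: it assumes that conda environments keep their package metadata in a conda-meta folder under sys.prefix, and running_in_conda is a name invented for this example.

```python
import os
import sys


def running_in_conda():
    """Guess whether the current interpreter belongs to a conda environment."""
    # Heuristic: conda environments contain a "conda-meta" folder at their root.
    return os.path.isdir(os.path.join(sys.prefix, "conda-meta"))


print("Interpreter:", sys.executable)
print("Looks like a conda environment:", running_in_conda())
```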
How to Download NLTK Dataset
The NLTK module comes with datasets that will come in handy in your projects. A dataset of this kind is called a corpus (plural: corpora). Examples include stopwords, framenet_v15, large_grammars, and many more. To download these datasets, follow these steps
- In your Python shell, run this command
import nltk
nltk.download()
- An NLTK Downloader dialog box pops up. Click on the Download button to download the dataset.
- To check that it has downloaded successfully, run this command
from nltk import corpus
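If it runs without error, the corpus package is available. Keep in mind that NLTK loads corpora lazily, so a missing dataset only raises an error when you first try to use it.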
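To confirm that one specific dataset is present, or to fetch a single dataset without opening the dialog box, you could use something like the sketch below. It relies on nltk.data.find and nltk.download; the helper names is_downloaded and ensure_downloaded are made up for this example, and the stopwords identifiers are shown only as a typical case.

```python
def is_downloaded(resource_path):
    """Return True if an NLTK resource such as 'corpora/stopwords' is on disk."""
    try:
        import nltk
        nltk.data.find(resource_path)  # raises LookupError when the resource is missing
        return True
    except (ImportError, LookupError):
        # Treat a missing NLTK install the same as a missing resource.
        return False


def ensure_downloaded(package_id, resource_path):
    """Download a single NLTK package only if it is not already available."""
    if is_downloaded(resource_path):
        return True
    import nltk
    # quiet=True suppresses the progress messages; returns False if the download fails.
    return nltk.download(package_id, quiet=True)


print("stopwords present:", is_downloaded("corpora/stopwords"))
```

Calling ensure_downloaded("stopwords", "corpora/stopwords") would then download the stop word lists only when they are missing, which needs an internet connection the first time.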
How to Run NLTK Script
There are many modules and functions in the NLTK library. Let’s just run a simple tokenize function on a chunk of text
from nltk import tokenize
text = "Wow, I am almost done with this tutorial"
print(tokenize.word_tokenize(text))
Output:
['Wow', ',', 'I', 'am', 'almost', 'done', 'with', 'this', 'tutorial']
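On a fresh install, word_tokenize may stop with a LookupError because the tokenizer models it depends on have not been downloaded yet. The sketch below retries after fetching them. It assumes the models are published under the package names punkt or punkt_tab, depending on your NLTK version, so it requests both; tokenize_words is a hypothetical wrapper name, not part of NLTK.

```python
def tokenize_words(text):
    """Tokenize text, fetching the tokenizer models on first use if needed."""
    import nltk
    from nltk.tokenize import word_tokenize

    try:
        return word_tokenize(text)
    except LookupError:
        # Assumption: older NLTK releases need "punkt", newer ones "punkt_tab".
        for package_id in ("punkt", "punkt_tab"):
            nltk.download(package_id, quiet=True)
        return word_tokenize(text)
```

You would call it the same way as the original function, for example tokenize_words("Wow, I am almost done with this tutorial"). The first call needs an internet connection if the models are missing.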
Going forward in this tutorial series, we will discuss different operations the NLTK library is capable of doing. In the meantime, congratulations on writing your first NLTK script.