PyTorch for NLP: Basic Concepts and Operations

This blog teaches you how to create and manipulate tensors, the fundamental data structure of PyTorch, for natural language processing tasks.

1. Introduction

PyTorch is a popular open-source framework for deep learning and natural language processing (NLP). It provides a flexible and expressive way to define, train, and deploy neural networks. One of the core features of PyTorch is its support for tensors, which are multidimensional arrays that can store and manipulate numerical data.

In this tutorial, you will learn the basic concepts and operations of tensors in PyTorch. You will learn how to create tensors from different sources, how to manipulate tensors using various methods and functions, and how to perform common operations on tensors such as addition, multiplication, and broadcasting. By the end of this tutorial, you will have a solid understanding of how to work with tensors in PyTorch for NLP tasks.

Before you start, make sure you have PyTorch installed on your system; the official installation guide on the PyTorch website walks through the options. You will also need a Python interpreter and an IDE or text editor of your choice. This tutorial uses Jupyter Notebook, but any tool that suits your preference will work.

Ready to dive into tensors? Let’s get started!

2. What are Tensors?

Tensors are the fundamental data structure of PyTorch. They are multidimensional arrays that can store and manipulate numerical data of various types and shapes. Tensors are similar to NumPy arrays, but they have some additional features that make them suitable for deep learning and NLP.

One of the main features of tensors is that they can be used on different devices, such as CPUs and GPUs. This allows you to leverage the power of parallel computing and speed up your computations. You can easily move tensors from one device to another using the to() method.
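As a minimal sketch of moving a tensor between devices, the snippet below picks a GPU when PyTorch can see one and falls back to the CPU otherwise. The variable names are illustrative, and tensor creation itself is covered in the next section.

```python
import torch

# Pick a GPU if PyTorch can see one; otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors are created on the CPU by default
t = torch.tensor([1.0, 2.0, 3.0])

# to() returns a tensor on the target device
# (the same tensor if it already lives there)
t_on_device = t.to(device)
print(t_on_device.device)

# Move the tensor back to the CPU, for example before converting it to NumPy
t_back = t_on_device.to("cpu")
print(t_back.device)
```

Operations between two tensors generally require both to be on the same device, so it is common to define `device` once and reuse it throughout a script.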

Another feature of tensors is that they can track the operations that are performed on them. This enables you to perform automatic differentiation and backpropagation, which are essential for training neural networks. You can control whether a tensor requires gradient or not by setting the requires_grad attribute.
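To make the idea of gradient tracking concrete, here is a small sketch. The names `w` and `loss` are placeholders chosen for this example, and the "loss" is just a sum of squares rather than anything from a real model.

```python
import torch

# requires_grad=True asks PyTorch to record operations on this tensor
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# A toy scalar "loss": 1^2 + 2^2 + 3^2 = 14
loss = (w ** 2).sum()

# backward() computes d(loss)/dw and stores it in w.grad
loss.backward()

# The derivative of sum(w^2) with respect to w is 2 * w
print(w.grad)
# Output: tensor([2., 4., 6.])
```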

Tensors also interact with other PyTorch components, such as datasets, dataloaders, optimizers, and models. These components provide convenient and efficient ways to load, process, optimize, and evaluate your data and models. They are outside the scope of this tutorial, which focuses on tensors themselves, but everything they do builds on the tensor operations covered here.

Now that you have a general idea of what tensors are, let’s see how you can create them in PyTorch.

3. How to Create Tensors in PyTorch

There are many ways to create tensors in PyTorch. You can use various functions and methods provided by the torch module, or you can convert other data types, such as lists, NumPy arrays, or Pandas dataframes, into tensors. In this section, you will learn some of the most common and useful ways to create tensors in PyTorch.

One of the simplest ways to create a tensor is to use the torch.tensor() function. This function takes a data argument, which can be a list, a tuple, a NumPy array, or any other array-like object, and returns a tensor with the same data and shape. For example, you can create a one-dimensional tensor (also called a vector) from a list of numbers as follows:

# Import torch module
import torch

# Create a one-dimensional tensor from a list
x = torch.tensor([1, 2, 3, 4, 5])
print(x)
# Output: tensor([1, 2, 3, 4, 5])

You can also create a two-dimensional tensor (also called a matrix) from a nested list of numbers as follows:

# Create a two-dimensional tensor from a nested list
y = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(y)
# Output: tensor([[1, 2, 3],
#                 [4, 5, 6],
#                 [7, 8, 9]])

You can specify the data type of the tensor by passing the dtype argument to the torch.tensor() function. PyTorch supports various data types, such as torch.float32, torch.int64, torch.bool, and more. For example, you can create a tensor of type torch.float32 as follows:

# Create a tensor of type torch.float32
z = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)
print(z)
# Output: tensor([1., 2., 3.])
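Data types matter in NLP because token IDs are normally stored as integers. The sketch below uses a tiny made-up vocabulary (the `vocab` dictionary and the sentence are hypothetical, purely for illustration) to show the default dtype PyTorch infers and how to convert it.

```python
import torch

# A toy vocabulary, invented for this example
vocab = {"<pad>": 0, "i": 1, "love": 2, "pytorch": 3}

# Map each token of a sentence to its integer ID
tokens = ["i", "love", "pytorch"]
token_ids = torch.tensor([vocab[tok] for tok in tokens])

# Python integers are stored as 64-bit integers by default
print(token_ids.dtype)
# Output: torch.int64

# to() with a dtype returns a converted copy; the original is unchanged
as_float = token_ids.to(torch.float32)
print(as_float.dtype)
# Output: torch.float32
```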

Another way to create a tensor is to use the torch.from_numpy() function. This function takes a NumPy array as an argument and returns a tensor that shares the same data and memory location with the NumPy array. This means that any changes made to the NumPy array will also affect the tensor, and vice versa. For example, you can create a tensor from a NumPy array as follows:

# Import numpy module
import numpy as np

# Create a NumPy array
a = np.array([1, 2, 3, 4, 5])

# Create a tensor from the NumPy array
b = torch.from_numpy(a)
print(b)
# Output: tensor([1, 2, 3, 4, 5])

# Change the value of the NumPy array
a[0] = 10

# Check the value of the tensor
print(b)
# Output: tensor([10,  2,  3,  4,  5])
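By contrast, torch.tensor() copies its input, so the resulting tensor is independent of the NumPy array. The following sketch puts the two behaviours side by side; the variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])

shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # makes an independent copy

# Modify the NumPy array in place
arr[0] = 100

print(shared.tolist())
# Output: [100, 2, 3]
print(copied.tolist())
# Output: [1, 2, 3]
```

Sharing memory avoids a copy for large arrays, while copying is the safer choice when the array may be modified elsewhere.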

PyTorch also provides many other functions to create tensors with specific values or shapes. For example, you can use the torch.zeros() function to create a tensor filled with zeros, the torch.ones() function to create a tensor filled with ones, the torch.rand() function to create a tensor with random values drawn uniformly from the interval [0, 1), and the torch.randn() function to create a tensor with random values drawn from the standard normal distribution (mean 0, variance 1). You can also use the torch.arange() function to create a tensor with a range of values, and the torch.reshape() function to change the shape of an existing tensor. Here are some examples of using these functions (the random values shown are just one sample, so your output will differ):

# Create a tensor of shape (3, 3) filled with zeros
c = torch.zeros(3, 3)
print(c)
# Output: tensor([[0., 0., 0.],
#                 [0., 0., 0.],
#                 [0., 0., 0.]])

# Create a tensor of shape (2, 2) filled with ones
d = torch.ones(2, 2)
print(d)
# Output: tensor([[1., 1.],
#                 [1., 1.]])

# Create a tensor of shape (4, 4) with random values from a uniform distribution
e = torch.rand(4, 4)
print(e)
# Output: tensor([[0.8149, 0.9508, 0.3174, 0.8781],
#                 [0.1822, 0.2128, 0.9708, 0.8660],
#                 [0.4310, 0.8075, 0.9003, 0.6147],
#                 [0.4879, 0.5449, 0.2347, 0.4206]])

# Create a tensor of shape (4, 4) with random values from a normal distribution
f = torch.randn(4, 4)
print(f)
# Output: tensor([[ 0.2219, -0.1730, -0.0750, -0.7891],
#                 [-0.4570, -0.3438, -0.1219, -0.1593],
#                 [-0.2227, -0.1773,  0.4466, -0.0919],
#                 [-0.5129, -0.3008, -0.1791,  0.3151]])

# Create a tensor with values from 0 to 9
g = torch.arange(10)
print(g)
# Output: tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Reshape the tensor g to have shape (2, 5)
h = torch.reshape(g, (2, 5))
print(h)
# Output: tensor([[0, 1, 2, 3, 4],
#                 [5, 6, 7, 8, 9]])
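Two small variations on the functions above are often handy: seeding the random number generator so that random tensors are reproducible, and passing a start, end, and step to torch.arange(). This is a brief sketch, and the seed value 0 is arbitrary.

```python
import torch

# Seeding the generator makes the random draws repeatable
torch.manual_seed(0)
first = torch.rand(2, 2)

torch.manual_seed(0)
second = torch.rand(2, 2)

print(torch.equal(first, second))
# Output: True

# arange(start, end, step) excludes the end value
evens = torch.arange(0, 10, 2)
print(evens)
# Output: tensor([0, 2, 4, 6, 8])

# Creation functions also accept a dtype argument
int_zeros = torch.zeros(2, 3, dtype=torch.int64)
print(int_zeros.dtype)
# Output: torch.int64
```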

These are some of the most common and useful ways to create tensors in PyTorch. You can find more information and examples in the official PyTorch documentation. In the next section, you will learn how to manipulate tensors.

4. How to Manipulate Tensors in PyTorch

Once you have created tensors in PyTorch, you can manipulate them using various methods and functions. You can perform arithmetic operations, such as addition, subtraction, multiplication, and division, on tensors of the same or compatible shapes. You can also apply mathematical functions, such as logarithm, exponentiation, and trigonometry, on tensors element-wise. In addition, you can perform linear algebra operations, such as matrix multiplication, transpose, inverse, and determinant, on tensors of appropriate dimensions. In this section, you will learn some of the most common and useful ways to manipulate tensors in PyTorch.

One of the simplest ways to manipulate tensors is to use the built-in operators, such as +, -, *, and /, to perform arithmetic operations on tensors. These operators are overloaded to work with tensors of the same or compatible shapes. For example, you can add two tensors of the same shape as follows:

# Create two tensors of shape (2, 2)
x = torch.tensor([[1, 2], [3, 4]])
y = torch.tensor([[5, 6], [7, 8]])

# Add the two tensors
z = x + y
print(z)
# Output: tensor([[ 6,  8],
#                 [10, 12]])

You can also multiply two tensors of compatible shapes using the * operator. This performs element-wise multiplication, which means that each element of the first tensor is multiplied by the corresponding element of the second tensor. For example, you can multiply two tensors of shape (2, 2) as follows:

# Multiply the two tensors
w = x * y
print(w)
# Output: tensor([[ 5, 12],
#                 [21, 32]])
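Division deserves a short note because the result type changes. The sketch below reuses the same two integer tensors and assumes a reasonably recent PyTorch release (the rounding_mode argument of torch.div() is not available in very old versions).

```python
import torch

x = torch.tensor([[1, 2], [3, 4]])
y = torch.tensor([[5, 6], [7, 8]])

# The / operator performs true division, so integer inputs give a float result
quotient = y / x
print(quotient.dtype)
# Output: torch.float32

# Use rounding_mode="floor" to keep an integer result
floor_quotient = torch.div(y, x, rounding_mode="floor")
print(floor_quotient.tolist())
# Output: [[5, 3], [2, 2]]

# Mixing an integer tensor with a Python float also promotes to float
mixed = x + 0.5
print(mixed.dtype)
# Output: torch.float32
```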

If you want to perform matrix multiplication, which is a common operation in linear algebra and deep learning, you can use the @ operator or the torch.matmul() function. This requires that the number of columns of the first tensor matches the number of rows of the second tensor. For example, you can multiply a tensor of shape (2, 3) by a tensor of shape (3, 2) as follows:

# Create two tensors of shape (2, 3) and (3, 2)
a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = torch.tensor([[7, 8], [9, 10], [11, 12]])

# Perform matrix multiplication
c = a @ b
print(c)
# Output: tensor([[ 58,  64],
#                 [139, 154]])
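As a quick check of the shape rule, the following sketch confirms that the @ operator and torch.matmul() agree, and shows what happens when the inner dimensions do not match. The flag name `shapes_ok` is just for illustration.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6]])        # shape (2, 3)
b = torch.tensor([[7, 8], [9, 10], [11, 12]])   # shape (3, 2)

# The operator and the function compute the same product
same = torch.equal(a @ b, torch.matmul(a, b))
print(same)
# Output: True

# Swapping the operands is also valid here, but gives a (3, 3) result
print((b @ a).shape)
# Output: torch.Size([3, 3])

# (2, 3) @ (2, 3) is invalid: 3 columns do not match 2 rows
try:
    a @ a
    shapes_ok = True
except RuntimeError as err:
    shapes_ok = False
    print("Shape mismatch:", err)
```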

PyTorch also provides many other functions to manipulate tensors using mathematical and linear algebra operations. For example, you can use the torch.log() function to compute the natural logarithm of a tensor, the torch.exp() function to compute the exponential of a tensor, and the torch.sin() function to compute the sine of a tensor. You can also use the torch.t() function to transpose a two-dimensional tensor, the torch.inverse() function to compute the inverse of a square matrix, and the torch.det() function to compute its determinant. The inverse and determinant functions expect floating-point (or complex) input, which is why the example below creates the tensor with dtype=torch.float32. Here are some examples of using these functions:

# Create a tensor of shape (2, 2)
d = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)

# Compute the natural logarithm of the tensor
e = torch.log(d)
print(e)
# Output: tensor([[0.0000, 0.6931],
#                 [1.0986, 1.3863]])

# Compute the exponential of the tensor
f = torch.exp(d)
print(f)
# Output: tensor([[ 2.7183,  7.3891],
#                 [20.0855, 54.5982]])

# Compute the sine of the tensor
g = torch.sin(d)
print(g)
# Output: tensor([[ 0.8415,  0.9093],
#                 [ 0.1411, -0.7568]])

# Transpose the tensor
h = torch.t(d)
print(h)
# Output: tensor([[1., 3.],
#                 [2., 4.]])

# Compute the inverse of the tensor
i = torch.inverse(d)
print(i)
# Output: tensor([[-2.0000,  1.0000],
#                 [ 1.5000, -0.5000]])

# Compute the determinant of the tensor
j = torch.det(d)
print(j)
# Output: tensor(-2.)

These are some of the most common and useful ways to manipulate tensors in PyTorch. You can find more information and examples in the official PyTorch documentation. In the next section, you will learn how to slice and index tensors.

4.1. Slicing and Indexing Tensors

Slicing and indexing are important skills that allow you to access and modify specific elements or regions of a tensor. PyTorch follows largely the same syntax and rules as NumPy here. One notable exception is negative step sizes such as x[::-1], which PyTorch does not support; use torch.flip() to reverse a tensor instead. In this section, you will learn some of the most common and useful ways to slice and index tensors in PyTorch.

One of the simplest ways to slice and index tensors is to use the square brackets [] and the colon : operators. You can use the square brackets to specify the index or indices of the elements or rows that you want to access or modify. You can use the colon operator to specify a range of indices or a step size. For example, you can access the first element of a one-dimensional tensor as follows:

# Create a one-dimensional tensor
x = torch.tensor([1, 2, 3, 4, 5])

# Access the first element
y = x[0]
print(y)
# Output: tensor(1)

You can also access the last element of a one-dimensional tensor by using the negative index -1 as follows:

# Access the last element
z = x[-1]
print(z)
# Output: tensor(5)

You can use the colon operator to access a slice of a one-dimensional tensor by specifying the start and end indices. For example, you can access the elements from index 1 to index 3 (exclusive) as follows:

# Access a slice of the tensor
w = x[1:3]
print(w)
# Output: tensor([2, 3])

You can also use the colon operator to access a slice of a one-dimensional tensor by specifying the step size. For example, you can access every other element of the tensor as follows:

# Access a slice of the tensor with a step size of 2
v = x[::2]
print(v)
# Output: tensor([1, 3, 5])

You can use the same syntax and rules to slice and index two-dimensional tensors or higher-dimensional tensors. You can use a comma , to separate the indices or slices for each dimension. For example, you can access the element at row 0 and column 1 of a two-dimensional tensor as follows:

# Create a two-dimensional tensor
a = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Access the element at row 0 and column 1
b = a[0, 1]
print(b)
# Output: tensor(2)

You can also access a submatrix of a two-dimensional tensor by specifying the slices for each dimension. For example, you can access the submatrix that contains the elements from row 0 to row 2 (exclusive) and from column 1 to column 3 (exclusive) as follows:

# Access a submatrix of the tensor
c = a[0:2, 1:3]
print(c)
# Output: tensor([[2, 3],
#                 [5, 6]])
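The same indexing syntax works on the left-hand side of an assignment, which is how you modify part of a tensor. The sketch below also illustrates that a basic slice is a view: writing through it changes the original tensor.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Overwrite a single element
a[0, 1] = 20

# Overwrite a whole column with one value
a[:, 2] = 0

# A row selected by basic indexing is a view of the same memory
row = a[1]
row[0] = -4

print(a.tolist())
# Output: [[1, 20, 0], [-4, 5, 0], [7, 8, 0]]
```

If you need an independent copy of a slice, call clone() on it before modifying it.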

PyTorch also provides other functions to slice and index tensors in more advanced ways. For example, you can use the torch.index_select() function to select specific rows or columns of a tensor by passing a tensor of indices. You can also use the torch.masked_select() function to select the elements of a tensor that satisfy a boolean condition. You can find more information and examples in the official PyTorch documentation.
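As a brief sketch of the two functions named above, the example below selects rows and columns by index and then filters elements with a boolean mask. The threshold of 5 is arbitrary.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Select rows 0 and 2 (dimension 0 indexes rows)
rows = torch.index_select(a, 0, torch.tensor([0, 2]))
print(rows.tolist())
# Output: [[1, 2, 3], [7, 8, 9]]

# Select column 1 (dimension 1 indexes columns)
cols = torch.index_select(a, 1, torch.tensor([1]))
print(cols.tolist())
# Output: [[2], [5], [8]]

# Build a boolean mask and keep the elements where it is True
mask = a > 5
big = torch.masked_select(a, mask)
print(big.tolist())
# Output: [6, 7, 8, 9]
```

Note that torch.masked_select() always returns a one-dimensional tensor, because the selected elements generally do not form a rectangle. Indexing with the mask directly, as in a[mask], gives the same result.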

4.2. Reshaping Tensors

Reshaping tensors is another important skill that allows you to change the shape or dimensionality of a tensor without changing its elements or their order. Whenever possible, PyTorch does this without copying the underlying data. You can reshape tensors to match the input or output requirements of different PyTorch components, such as datasets, dataloaders, models, and optimizers. In this section, you will learn some of the most common and useful ways to reshape tensors in PyTorch.

One of the simplest ways to reshape tensors is to use the torch.reshape() function. This function takes a tensor and a new shape as arguments and returns a tensor with the same data and the new shape. When possible, the result is a view that shares memory with the original tensor; otherwise it is a copy. The new shape must be compatible with the original shape, which means that the product of the dimensions of the new shape must equal the product of the dimensions of the original shape. For example, you can reshape a tensor of shape (2, 3) to a tensor of shape (6, 1) as follows:

# Create a tensor of shape (2, 3)
x = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Reshape the tensor to shape (6, 1)
y = torch.reshape(x, (6, 1))
print(y)
# Output: tensor([[1],
#                 [2],
#                 [3],
#                 [4],
#                 [5],
#                 [6]])
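You do not have to spell out every dimension yourself. In the sketch below, -1 asks PyTorch to infer one dimension from the total number of elements, and an incompatible shape raises an error. The flag name `reshape_ok` is illustrative.

```python
import torch

x = torch.arange(12)

# -1 is inferred: 12 elements / 3 rows = 4 columns
m = x.reshape(3, -1)
print(m.shape)
# Output: torch.Size([3, 4])

# reshape(-1) flattens a tensor back to one dimension
flat = m.reshape(-1)
print(flat.shape)
# Output: torch.Size([12])

# 12 elements cannot be split evenly into 5 rows
try:
    x.reshape(5, -1)
    reshape_ok = True
except RuntimeError as err:
    reshape_ok = False
    print("Invalid shape:", err)
```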

You can also use the view() method of a tensor to reshape it. It works similarly to torch.reshape(), with two differences: view() always returns a tensor that shares memory with the original, and it only works when the requested shape is compatible with the tensor's memory layout. That is always the case for contiguous tensors, which store their elements in one contiguous block of memory in row-major order. If the tensor is not contiguous, you can call its contiguous() method to obtain a contiguous copy before calling view(). For example, you can reshape a tensor of shape (2, 3) to a tensor of shape (3, 2) as follows:

# Create a tensor of shape (2, 3)
x = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Reshape the tensor to shape (3, 2)
y = x.view(3, 2)
print(y)
# Output: tensor([[1, 2],
#                 [3, 4],
#                 [5, 6]])
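The contiguity constraint is easiest to see with a transposed tensor. The following sketch shows view() failing on a non-contiguous tensor, two ways around it, and the fact that a view shares memory with its source. The flag name `view_worked` is just for illustration.

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Transposing changes how the tensor is read, not how it is stored
xt = x.t()
print(xt.is_contiguous())
# Output: False

# view() cannot flatten the transposed tensor directly
try:
    xt.view(6)
    view_worked = True
except RuntimeError:
    view_worked = False

# Either make a contiguous copy first, or let reshape() handle it
flat = xt.contiguous().view(6)
flat_again = xt.reshape(6)
print(flat.tolist())
# Output: [1, 4, 2, 5, 3, 6]

# A view shares memory with its source: writing to y changes x
y = x.view(3, 2)
y[0, 0] = 100
print(x[0, 0].item())
# Output: 100
```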

PyTorch also provides other functions to reshape tensors in more advanced ways. For example, you can use the torch.squeeze() function to remove dimensions of size 1 from a tensor, the torch.unsqueeze() function to add a dimension of size 1 to a tensor, and the torch.permute() function to change the order of the dimensions of a tensor. You can find more information and examples in the official PyTorch documentation.
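Here is a short sketch of those three operations. A typical NLP use is adding a batch dimension to a single sequence of token IDs; the IDs below are made up, and the method forms (for example t.permute(...)) are used because they behave the same as the corresponding torch functions.

```python
import torch

# A single sequence of (made-up) token IDs, shape (3,)
seq = torch.tensor([4, 8, 15])

# Add a batch dimension at position 0: shape (1, 3)
batch = seq.unsqueeze(0)
print(batch.shape)
# Output: torch.Size([1, 3])

# Add a trailing dimension instead: shape (3, 1)
col = torch.unsqueeze(seq, 1)
print(col.shape)
# Output: torch.Size([3, 1])

# Remove the size-1 dimension again: shape (3,)
back = batch.squeeze(0)
print(back.shape)
# Output: torch.Size([3])

# Reorder dimensions: (2, 3, 4) becomes (4, 2, 3)
t = torch.zeros(2, 3, 4)
p = t.permute(2, 0, 1)
print(p.shape)
# Output: torch.Size([4, 2, 3])
```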

4.3. Broadcasting Tensors

Broadcasting is a powerful feature that allows you to perform arithmetic operations on tensors of different shapes. PyTorch treats the smaller tensor as if it had been expanded to match the shape of the larger tensor, without actually copying its data; only the result of the operation is allocated. This makes the computation more efficient and convenient. In this section, you will learn some of the most common and useful ways to broadcast tensors in PyTorch.

One of the simplest ways to broadcast tensors is to use the built-in operators, such as +, -, *, and /, to perform arithmetic operations on tensors of different shapes. These operators are overloaded to work with tensors of compatible shapes, which means that the smaller tensor can be expanded along one or more dimensions to match the shape of the larger tensor. For example, you can add a scalar (a tensor of shape ()) to a tensor of shape (2, 2) as follows:

# Create a tensor of shape (2, 2)
x = torch.tensor([[1, 2], [3, 4]])

# Add a scalar to the tensor
y = x + 10
print(y)
# Output: tensor([[11, 12],
#                 [13, 14]])

You can also add a vector (a tensor of shape (n,)) to a matrix (a tensor of shape (m, n)) as follows:

# Create a tensor of shape (2, 3)
x = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Create a tensor of shape (3,)
y = torch.tensor([10, 20, 30])

# Add the two tensors
z = x + y
print(z)
# Output: tensor([[11, 22, 33],
#                 [14, 25, 36]])

In both cases, the smaller tensor is broadcast to match the shape of the larger tensor by virtually repeating its elements along the missing or singleton dimensions. The result is a tensor of the same shape as the larger tensor, with the arithmetic operation applied element-wise.
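The general rule is that shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal, when one of them is 1, or when one of them is missing. The sketch below shows a case where both operands are expanded, and a case where the shapes are incompatible. The flag name `broadcast_ok` is illustrative.

```python
import torch

col = torch.tensor([[1], [2]])      # shape (2, 1)
row = torch.tensor([10, 20, 30])    # shape (3,)

# (2, 1) and (3,) broadcast to (2, 3): both operands are expanded
grid = col + row
print(grid.tolist())
# Output: [[11, 21, 31], [12, 22, 32]]

# (2, 3) and (2,) are incompatible: the last dimensions are 3 and 2
try:
    torch.zeros(2, 3) + torch.zeros(2)
    broadcast_ok = True
except RuntimeError as err:
    broadcast_ok = False
    print("Cannot broadcast:", err)
```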

PyTorch also provides ways to broadcast tensors explicitly. For example, you can use the torch.broadcast_tensors() function to broadcast a sequence of tensors to a common shape, the expand() method of a tensor to expand its singleton dimensions to a larger size, and the expand_as() method to expand a tensor to the same shape as another tensor. You can find more information and examples in the official PyTorch documentation.
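The explicit broadcasting tools mentioned above can be sketched as follows. The tensor names are placeholders, and the data_ptr() comparison is only there to illustrate that expanding does not copy the underlying data.

```python
import torch

row = torch.tensor([[10, 20, 30]])   # shape (1, 3)

# expand() stretches singleton dimensions without copying data
expanded = row.expand(2, 3)
print(expanded.tolist())
# Output: [[10, 20, 30], [10, 20, 30]]
print(expanded.data_ptr() == row.data_ptr())
# Output: True

# expand_as() takes the target shape from another tensor
target = torch.zeros(4, 3)
like_target = row.expand_as(target)
print(like_target.shape)
# Output: torch.Size([4, 3])

# broadcast_tensors() expands several tensors to one common shape
a, b = torch.broadcast_tensors(
    torch.tensor([[1], [2]]),        # shape (2, 1)
    torch.tensor([10, 20, 30]),      # shape (3,)
)
print(a.tolist())
# Output: [[1, 1, 1], [2, 2, 2]]
print(b.tolist())
# Output: [[10, 20, 30], [10, 20, 30]]
```

Because expanded tensors share memory with their source, it is safest to treat them as read-only and call clone() first if you need to write to them.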

5. Conclusion

In this tutorial, you have learned the basic concepts and operations of tensors in PyTorch. You have learned how to create tensors from different sources, how to manipulate them with arithmetic, mathematical, and linear algebra operations, how to slice, index, and reshape them, and how broadcasting lets you combine tensors of different shapes.

Tensors are the fundamental data structure of PyTorch, and they are essential for performing deep learning and natural language processing tasks. By mastering the skills of working with tensors, you will be able to build and train powerful neural networks and models using PyTorch. You will also be able to apply your knowledge and skills to other domains and problems that involve numerical data and computations.

We hope you have enjoyed this tutorial and learned something new and useful. If you want to learn more about PyTorch and its features and applications, you can check out the following resources:

The official PyTorch website, where you can find the latest news, tutorials, documentation, and community forums.
The official PyTorch tutorials, which cover a variety of topics and levels, such as beginner, intermediate, advanced, computer vision, natural language processing, generative models, and more.
The official PyTorch examples, a collection of high-quality examples of using PyTorch for various tasks and domains, such as image classification, reinforcement learning, machine translation, and more.
The official PyTorch GitHub repository, where you can find the source code, issues, pull requests, and contributions of PyTorch.

Thank you for reading this tutorial and happy learning!