Deploying a PyTorch Model Locally

Photo by Benjamin Sow on Unsplash
The 5 layers of our DCGAN, translating an input vector into a 3-channel image
optimizerG.step()
  • Forward: this pass takes an input and calculates an output.
  • Backward: this pass steps back through the neural network and calculates a gradient for every parameter, which the optimizer then uses to update the parameters. This is the important step in training; without it, the neural network doesn’t learn.
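To make the two passes concrete, here is a minimal sketch of one training step. The one-layer model and the random data are hypothetical stand-ins, not the DCGAN from this article; only the order of the calls is the point.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model and data; the article's networks are the DCGAN G and D.
model = nn.Linear(4, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(8, 4)
targets = torch.randn(8, 1)

before = model.weight.detach().clone()

optimizer.zero_grad()               # clear gradients left over from a previous step
output = model(inputs)              # forward: input -> output
loss = criterion(output, targets)   # measure how wrong the output was
loss.backward()                     # backward: compute a gradient for each parameter
optimizer.step()                    # apply the gradients to update the parameters

weights_changed = not torch.equal(before, model.weight.detach())
print("weights changed:", weights_changed)
```

If you skip the backward call, the parameters have no gradients and the optimizer has nothing to apply.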
# Calculate G's loss based on this output
errG = criterion(output, label)
# Calculate gradients for G
errG.backward()
Generator’s state dictionary:
main.0.weight torch.Size([100, 512, 4, 4])
main.1.weight torch.Size([512])
main.1.bias torch.Size([512])
main.1.running_mean torch.Size([512])
main.1.running_var torch.Size([512])
main.1.num_batches_tracked torch.Size([])
main.3.weight torch.Size([512, 256, 4, 4])
main.4.weight torch.Size([256])
main.4.bias torch.Size([256])
main.4.running_mean torch.Size([256])
main.4.running_var torch.Size([256])
main.4.num_batches_tracked torch.Size([])
main.6.weight torch.Size([256, 128, 4, 4])
main.7.weight torch.Size([128])
main.7.bias torch.Size([128])
main.7.running_mean torch.Size([128])
main.7.running_var torch.Size([128])
main.7.num_batches_tracked torch.Size([])
main.9.weight torch.Size([128, 64, 4, 4])
main.10.weight torch.Size([64])
main.10.bias torch.Size([64])
main.10.running_mean torch.Size([64])
main.10.running_var torch.Size([64])
main.10.num_batches_tracked torch.Size([])
main.12.weight torch.Size([64, 3, 4, 4])
Optimizer’s state dictionary:
state {0: {'step': 7915, 'exp_avg': tensor([[[[-8.4379e-03, -1.6962e-03, 2.0153e-03, -2.3263e-03],
[ 9.3564e-03, 2.8464e-03, -1.4765e-03, -5.5706e-03],
[-5.1847e-03, -2.0944e-03, 3.0152e-03, 1.6600e-03],
[ 3.3945e-03, 2.1832e-03, -7.7789e-03, 1.3872e-03]],
[[ 1.3178e-04, 5.9188e-04, -3.4215e-03, 2.8209e-03],
[ 3.4224e-03, 6.9770e-03, 5.4208e-05, 3.6286e-03],
[-1.9070e-03, -3.6697e-03, 7.6668e-04, 6.0801e-03],
[-8.1516e-04, 8.2781e-04, -1.6558e-03, -1.4505e-03]],
[[-3.2616e-03, 1.8192e-04, 4.1123e-03, -2.3757e-03],
[ 2.2800e-04, 2.8144e-03, -4.4250e-03, 1.5944e-03],
[ 2.1202e-03, -1.6148e-03, 1.3747e-03, 2.0928e-03],
[ 9.3368e-04, 3.6862e-03, -2.0053e-03, 2.0065e-03]],
…,[[-4.8926e-03, -1.9031e-03, -6.0549e-04, -3.5536e-03],
[ 1.8643e-03, 1.0315e-04, -4.2139e-04, 5.9912e-04],
[-6.5301e-03, -3.1602e-03, -2.5134e-03, 2.4062e-04],
[-6.7217e-03, -2.8157e-03, -4.1863e-03, 6.1009e-04]],
[[-4.3491e-03, -4.4880e-03, 2.9170e-03, -1.8324e-03],
[ 5.1785e-03, -2.6050e-03, -7.0698e-04, 6.6632e-04],
[-2.8962e-03, -3.9267e-03, -1.3776e-05, 3.3616e-03],
[-2.6761e-03, -1.0645e-02, 4.2122e-04, -6.2535e-04]],
[[ 1.4652e-04, 4.9615e-03, -3.5904e-03, -2.4742e-03],
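Listings like the ones above can be produced by walking the two state dictionaries. This is a sketch under assumptions: the two-layer `net` is a hypothetical stand-in for the Generator, and the Adam settings are illustrative.

```python
import torch
import torch.nn as nn

# Hypothetical two-layer stand-in; the article's listing comes from the DCGAN Generator.
net = nn.Sequential(
    nn.ConvTranspose2d(100, 8, 4, 1, 0, bias=False),
    nn.BatchNorm2d(8),
)
optimizer = torch.optim.Adam(net.parameters(), lr=0.0002, betas=(0.5, 0.999))

# The model's state dictionary maps each parameter or buffer name to a tensor.
print("Model's state dictionary:")
for name, tensor in net.state_dict().items():
    print(name, tensor.size())

# The optimizer's state dictionary holds its per-parameter state and its settings.
print("Optimizer's state dictionary:")
for key, value in optimizer.state_dict().items():
    print(key, value)

state_keys = list(net.state_dict().keys())
```

The optimizer's `state` entry stays empty until the first `step()`, which is why the dump above, taken after thousands of steps, is so large.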
  • Save the state dictionary. This saves the parameters. If you want to resume training later, save the optimizer’s state dictionary for each net as well; for DCGAN, that means saving the Discriminator, the Generator, and their respective optimizers. (You would still need some way to save the batch number, epoch, and other training-based checkpoint data.) According to the PyTorch documentation, this is the recommended method for inference. It is best used if you are going to run inference from PyTorch in Python, or want to compare different models later on. I expect that sometime after PyTorch v1.6 the rest of the ecosystem will catch up with this method.
  • Save the entire model. This uses Python’s pickle, which writes an object to disk (we call this serializing it). The saved data is bound to the specific classes and the exact directory structure in use when the model was saved, so if you later move things around, loading can easily break. With Azure, this is best used if you are going to register the model, download the model, or deploy the model elsewhere using PyTorch Android, ONNX, etc.
  • Save a checkpoint. This one is handy for resuming training later: it saves whichever parameters you tell it to, so you can load them again. When working with Azure, this is best used if you plan to continue training later and are just taking a break. (For instance, you’re going to hit your 21 day limit on your pay-as-you-go Azure account.)
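The three options above can be sketched side by side. Everything named here is an assumption for illustration: the tiny `netG` stands in for the real Generator, the file names are made up, and the temporary directory stands in for wherever you keep your outputs.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in network; substitute the real Generator (and Discriminator).
netG = nn.Sequential(nn.Linear(4, 2), nn.Tanh())
optimizerG = torch.optim.Adam(netG.parameters(), lr=0.0002, betas=(0.5, 0.999))

out_dir = tempfile.mkdtemp()  # placeholder for a real output directory

# 1. State dictionary only: just the parameters and buffers.
state_path = os.path.join(out_dir, "netG_state.pt")
torch.save(netG.state_dict(), state_path)

# 2. Entire model: pickles the object, so loading needs the same class definitions.
model_path = os.path.join(out_dir, "netG_full.pt")
torch.save(netG, model_path)

# 3. Checkpoint: a plain dict holding whatever is needed to resume training.
ckpt_path = os.path.join(out_dir, "checkpoint.pt")
torch.save(
    {
        "epoch": 5,  # illustrative value
        "netG_state_dict": netG.state_dict(),
        "optimizerG_state_dict": optimizerG.state_dict(),
    },
    ckpt_path,
)

# Restoring from the state dictionary: rebuild the architecture, then load the weights.
restored = nn.Sequential(nn.Linear(4, 2), nn.Tanh())
restored.load_state_dict(torch.load(state_path))
restored.eval()  # inference mode for layers such as batch norm and dropout

# Resuming from the checkpoint.
checkpoint = torch.load(ckpt_path)
netG.load_state_dict(checkpoint["netG_state_dict"])
optimizerG.load_state_dict(checkpoint["optimizerG_state_dict"])
start_epoch = checkpoint["epoch"] + 1
```

One thing to watch: from PyTorch 2.6 on, `torch.load` defaults to `weights_only=True`, so loading a file written by option 2 needs `weights_only=False` (and a file you trust).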
torch.save(model, os.path.join(args.output, 'dcgan.pt'))
for file in run.get_file_names():
    print("File:" + str(file))
OSError: [Errno 30] Read-only file system: '/tmp/tmpbgzzc7lr/real.png'
  • ./logs: these logs are uploaded in real time, so you can view them right from the portal while the run is going, just like the other Azure default logs.
  • ./outputs: these files are added to the run as artifacts, so they get logged as part of your experiment history.
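Assuming the read-only error above came from writing somewhere other than these two folders, a small helper keeps every artifact under one base directory. The name `artifact_path` is hypothetical, and the temporary directory is only there so the sketch can run anywhere; in a run, the base would simply be `./outputs`.

```python
import os
import tempfile


def artifact_path(filename, base_dir="outputs"):
    """Return a path under base_dir, creating the directory if needed."""
    os.makedirs(base_dir, exist_ok=True)
    return os.path.join(base_dir, filename)


# For this demo the "outputs" folder lives inside a temporary directory.
demo_base = os.path.join(tempfile.mkdtemp(), "outputs")

# Example: where a sample image would be written during a run.
sample_file = artifact_path("sample0.png", base_dir=demo_base)
print(sample_file)
```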
Cool — there’s our finalized model from PyTorch!
File:azureml-logs/55_azureml-execution-tvmps_bdf2d5554aaea8ecb28f826c58cd5da88f16c5a9a4b27fecfef7edda79dcb6c8_d.txt
File:azureml-logs/65_job_prep-tvmps_bdf2d5554aaea8ecb28f826c58cd5da88f16c5a9a4b27fecfef7edda79dcb6c8_d.txt
File:azureml-logs/70_driver_log.txt
File:azureml-logs/process_info.json
File:azureml-logs/process_status.json
File:logs/azureml/97_azureml.log
File:logs/azureml/dataprep/backgroundProcess.log
File:logs/azureml/dataprep/backgroundProcess_Telemetry.log
File:logs/azureml/dataprep/engine_spans_50c08491-8d59-4cf0-81f6-43aac45d3e6c.jsonl
File:logs/azureml/dataprep/engine_spans_cb744712-614b-4295-9815-5cbdaac9fc46.jsonl
File:logs/azureml/dataprep/python_span_50c08491-8d59-4cf0-81f6-43aac45d3e6c.jsonl
File:logs/azureml/dataprep/python_span_cb744712-614b-4295-9815-5cbdaac9fc46.jsonl
File:logs/azureml/job_prep_azureml.log
File:outputs/real.png
File:outputs/sample0.png
File:outputs/sample320.png
File:outputs/sample365.png
File:outputs/sample410.png
File:outputs/sample455.png
File:outputs/sample500.png
File:outputs/sample544.png
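If you would rather pull these files down from code than from the portal, something like the following sketch could work. It assumes the run object offers `get_file_names()` and `download_file(name, output_file_path=...)`, as `azureml.core.Run` does; `download_outputs` and `FakeRun` are hypothetical names, and the fake exists only so the sketch can be tried offline.

```python
import os
import tempfile


def download_outputs(run, dest_dir, prefix="outputs/"):
    """Download every run file whose name starts with prefix into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    for name in run.get_file_names():
        if name.startswith(prefix):
            target = os.path.join(dest_dir, os.path.basename(name))
            run.download_file(name, output_file_path=target)
            downloaded.append(target)
    return downloaded


class FakeRun:
    """Stand-in for a real run, returning a few names from the listing above."""

    def get_file_names(self):
        return ["azureml-logs/70_driver_log.txt", "outputs/real.png", "outputs/sample0.png"]

    def download_file(self, name, output_file_path=None):
        # A real run would fetch the artifact; the fake just writes a placeholder.
        with open(output_file_path, "w") as handle:
            handle.write("placeholder for " + name)


dest = tempfile.mkdtemp()
files = download_outputs(FakeRun(), dest)
print(sorted(os.path.basename(f) for f in files))
```

With a real run you would pass the run object itself in place of `FakeRun()`.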
cp /mnt/c/Users/<user id>/Downloads/dcgan.pt nets/
  • Add in our Generator structure
  • Set up the variables that the model needs to run
  • Load the model
  • Send a vector into the model
  • Write the resulting picture out as a file
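As a starting point for the variable setup step, here is a sketch of the command-line handling. The flag spellings and defaults are assumptions (the script only shows `args.net_path` and `args.output`); the sizes are the ones implied by the Generator's state dictionary.

```python
import argparse

# Sizes implied by the Generator's state dictionary: a 100-element input vector,
# 64 base feature maps (64 * 8 = 512 in the first layer), and 3 colour channels.
nz = 100
ngf = 64
nc = 3


def parse_args(argv=None):
    """Parse the (assumed) command-line flags for the generation script."""
    parser = argparse.ArgumentParser(
        description="Generate images from a saved DCGAN generator"
    )
    parser.add_argument("--net_path", default="nets/dcgan.pt", help="saved generator file")
    parser.add_argument("--output", default=".", help="directory for generated images")
    parser.add_argument("--ngpu", type=int, default=0, help="number of GPUs to use (0 = CPU)")
    return parser.parse_args(argv)


# Passing a list here mimics the command line, so the sketch runs without real arguments.
args = parse_args(["--net_path", "nets/dcgan.pt", "--output", "out"])
print(args.net_path, args.output, args.ngpu)
```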
netG = torch.load(args.net_path)
AttributeError: Can't get attribute 'Generator' on <module '__main__' from 'run-dcgan.py'>
class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            <snip>
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )
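Why does pasting in the class fix the AttributeError? Pickle stores a reference to the class (its module and name), not the class's code, so the loading script must be able to find a class by that name. Here is a minimal sketch using plain pickle; the module name `train_script` and the toy `Generator` are hypothetical.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the training script that defined the class.
train_script = types.ModuleType("train_script")
sys.modules["train_script"] = train_script


class Generator:
    def __init__(self):
        self.weights = [0.1, 0.2, 0.3]


# Make the class look as if it had been defined inside train_script.
Generator.__module__ = "train_script"
Generator.__qualname__ = "Generator"
train_script.Generator = Generator

# The pickle holds the reference "train_script.Generator" plus the instance attributes.
payload = pickle.dumps(Generator())

# Simulate loading from a script that never defined Generator.
del train_script.Generator
try:
    pickle.loads(payload)
    load_failed = False
except AttributeError as err:
    load_failed = True
    print("load failed:", err)

# Once the class definition is available again, the same bytes load fine.
train_script.Generator = Generator
restored = pickle.loads(payload)
print(restored.weights)
```

Saving the entire model with `torch.save` goes through pickle as well, which is why the Generator class has to exist in the script that loads it.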
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")
netG = torch.load(args.net_path, map_location=device)
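To see what `map_location` does in isolation, here is a small sketch that saves a stand-in state dictionary and loads it back onto whichever device gets picked. The `nn.Linear` weights, the file name, and the `ngpu = 0` setting are assumptions for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

ngpu = 0  # assumed setting; a value above 0 requests a GPU when one is available
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")

# Hypothetical stand-in weights, saved only so the sketch has something to load.
path = os.path.join(tempfile.mkdtemp(), "weights.pt")
torch.save(nn.Linear(4, 2).state_dict(), path)

# map_location remaps every stored tensor onto the chosen device while loading.
state = torch.load(path, map_location=device)
print({name: tensor.device.type for name, tensor in state.items()})
```

The same argument is what lets a model trained on a GPU machine load on a laptop with no CUDA at all.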
netG.eval()
fixed_noise = torch.randn(64, nz, 1, 1, device=device)
# generate a batch of fake images from the fixed noise; no gradients are needed for inference
with torch.no_grad():
    fake = netG(fixed_noise)
# arrange the 64 images into one grid, with a 2-pixel border between them
output = vutils.make_grid(fake, padding=2, normalize=True)
# save the grid as faces.png
vutils.save_image(output, os.path.join(args.output, "faces.png"), normalize=True)
A single image
A full grid of them!

Years of technology experience have given me a unique perspective on many things, including parenting, climate change, etc. Or maybe I’m just opinionated.

Allan Graves
