Machine learning (ML) tools, especially deep neural networks (DNNs), have garnered significant attention in the last decade; however, it is not clear whether ML tools can learn the inherent characteristics of a dynamical model (such as conservation laws) from the training data set. This paper examines the effectiveness of DNNs in learning dynamical system models, using the Keplerian two-body problem as a test case. Training a DNN with data from a single revolution produces poor performance when predicting motion on subsequent revolutions. By incorporating deviations from constancy of angular momentum and total energy into the loss function for the DNN, predictive performance improves significantly. Further improvements appear when a richer training data set (generated from a number of orbits with different orbital element values) is employed.
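
To illustrate the idea of penalizing deviations from conserved quantities, the following is a minimal NumPy sketch, not the implementation used in this paper. It assumes planar motion with states stored as [x, y, vx, vy], canonical units with gravitational parameter MU = 1, and a simple weighted sum of squared errors. The function names and the weights w_h and w_e are hypothetical choices made for this example.

```python
import numpy as np

MU = 1.0  # gravitational parameter; canonical units are an assumption


def specific_angular_momentum(state):
    """Out-of-plane specific angular momentum h = x*vy - y*vx."""
    x, y, vx, vy = (state[..., i] for i in range(4))
    return x * vy - y * vx


def specific_energy(state, mu=MU):
    """Specific orbital energy: kinetic term minus mu / r."""
    x, y, vx, vy = (state[..., i] for i in range(4))
    r = np.hypot(x, y)
    return 0.5 * (vx**2 + vy**2) - mu / r


def conservation_loss(pred, target, w_h=1.0, w_e=1.0, mu=MU):
    """State MSE plus penalties on angular momentum and energy drift.

    The reference values of h and energy are taken from the target
    states, which lie on the true orbit (an assumption of this sketch).
    """
    mse = np.mean((pred - target) ** 2)
    h_dev = specific_angular_momentum(pred) - specific_angular_momentum(target)
    e_dev = specific_energy(pred, mu) - specific_energy(target, mu)
    return mse + w_h * np.mean(h_dev**2) + w_e * np.mean(e_dev**2)


# Usage: a unit circular orbit sampled over one revolution.
t = np.linspace(0.0, 2.0 * np.pi, 50)
target = np.stack([np.cos(t), np.sin(t), -np.sin(t), np.cos(t)], axis=-1)
pred = 1.01 * target  # a slightly inflated, non-physical prediction
loss_value = conservation_loss(pred, target)
```

In an actual training loop the same expressions would be written in the automatic-differentiation framework used for the DNN, so that the penalty terms contribute gradients; the relative weights would need tuning for the problem at hand.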