# Overview

```
              _                              _
             | |                            (_)
  ___  _   _ | |__   _ __ ___    __ _  _ __  _  _ __    ___
 / __|| | | || '_ \ | '_ ` _ \  / _` || '__|| || '_ \  / _ \
 \__ \| |_| || |_) || | | | | || (_| || |   | || | | ||  __/
 |___/ \__,_||_.__/ |_| |_| |_| \__,_||_|   |_||_| |_| \___|

                             ?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~|^"~~~~~~~~~~~~~~~~~~~~~~~~~o~~~~~~~~~~~
        o                   |                  o      __o
         o                  |                 o     |X__>
       ___o                 |                __o
     (X___>--             __|__            |X__>     o
                         |     \                   __o
                         |      \                |X__>
  _______________________|_______\________________
 <                                                \____________   _
  \                                                            \ (_)
   \    O       O       O                                       >=)
    \__________________________________________________________/ (_)
```

Submarine is a project that allows infrastructure engineers and data scientists to run *unmodified* TensorFlow or PyTorch programs on YARN or Kubernetes.

Goals of Submarine:

- Give jobs easy access to data and models in HDFS and other storage systems.
- Launch services to serve TensorFlow/PyTorch models.
- Run distributed TensorFlow jobs with simple configuration.
- Run user-specified Docker images.
- Let users specify GPUs and other resources.
- Launch TensorBoard for training jobs when the user requests it.
- Support customized DNS names for roles (such as `tensorboard.$user.$domain:6006`).

See the [QuickStart](src/site/markdown/QuickStart.md) guide to learn how to use this framework.

See [Examples](src/site/markdown/Examples.md) to try other examples, such as running distributed TensorFlow training for CIFAR-10.
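
To make the role DNS naming goal concrete, here is a minimal Python sketch of how a pattern like `tensorboard.$user.$domain:6006` expands into an address. It only illustrates the naming scheme: the `role_address` helper, the `$role` and `$port` placeholders, and the sample user and domain are assumptions for this example, not part of Submarine's API.

```python
from string import Template

# Pattern modelled on the example in the goals list. Generalising the role
# and port into placeholders is an assumption made for this illustration.
ROLE_DNS_PATTERN = Template("$role.$user.$domain:$port")


def role_address(role: str, user: str, domain: str, port: int) -> str:
    """Hypothetical helper: fill in the placeholders of the DNS pattern."""
    return ROLE_DNS_PATTERN.substitute(
        role=role, user=user, domain=domain, port=port
    )


# Sample values only; a real deployment supplies its own user and domain.
print(role_address("tensorboard", "alice", "example.com", 6006))
```

With a naming scheme like this, each user's TensorBoard address is predictable from the role, the user name, and the cluster domain.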