I would suggest this is expected behavior. Geoprocessing tasks are rarely lightweight and therefore demand significant processing power. Even a 'simple' task ties up an instance of the service for however long it takes to generate the response, and that instance will consume up to a full processor core so it can respond as quickly as possible. Keep this in mind when deciding what to expose, how to optimize it, and how to configure your services (instances/pooling/hardware requirements).
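For what it's worth, these are the pooling-related properties I keep an eye on when tuning a GP service through the ArcGIS Server Admin API. The values below are placeholders for illustration only, not recommendations, and property names can differ slightly between server versions:

```python
# Illustrative instance/pooling settings for a GP service, in the shape of the
# service JSON the ArcGIS Server Admin API exposes (values are placeholders).
gp_service_pooling = {
    "minInstancesPerNode": 1,   # instances kept warm even when idle
    "maxInstancesPerNode": 4,   # hard cap -- each busy instance can use a full core
    "maxWaitTime": 60,          # seconds a request will queue before failing
    "maxUsageTime": 600,        # seconds a single request may hold an instance
    "maxIdleTime": 1800,        # seconds before an idle instance above the minimum is released
}
```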
I currently have a geoprocessing service that generates results for a table in a viewer. The task queries a table and returns 15 numbers. A previous version of this service took about six seconds to respond, which was fine on a quiet day, but under heavy load (200+ users refreshing every 60 seconds) it overloaded the server and caused timeouts. Now, at a 1.2-second response time and a 120-second refresh interval, that failure threshold has been pushed well out.
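The back-of-envelope math makes the difference obvious. Assuming requests are spread evenly over the refresh interval (real traffic never is, so treat this as a lower bound), the average number of busy instances is arrival rate times response time:

```python
# Rough capacity estimate (Little's law: busy instances = arrival rate x service time).
def busy_instances(users, refresh_interval_s, response_time_s):
    arrival_rate = users / refresh_interval_s   # requests per second
    return arrival_rate * response_time_s       # average instances tied up at any moment

# Old version: 200 users, 60 s refresh, 6 s response -> ~20 instances busy constantly
print(busy_instances(200, 60, 6.0))   # 20.0

# Current version: 200 users, 120 s refresh, 1.2 s response -> ~2 instances busy
print(busy_instances(200, 120, 1.2))  # 2.0
```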
In the past I worked on a report generation task. While the initial requirements appeared simple, it turned into a mammoth 2-3 hour job. You obviously can't generate many of these simultaneously on a four-core server and still reasonably host maps (run four at once and the server can't do anything else for hours). It was deployed to a single user behind a password with the expectation it would be run once a week. If I were doing it again, it wouldn't be developed as a GP service...
Techniques like limiting the size or resolution of prints, and restricting who can print, can help reduce the performance hit you take from printing. Asynchronous jobs with long timeouts allow prints to queue, and emailing the print to the user when it is complete beats forcing them to wait and stare at an hourglass. A rough sketch of that submit-and-poll pattern is below.
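Here is a minimal sketch of submitting to an asynchronous GP task's REST endpoint, polling until it finishes, and then emailing the user a link to the result. The service URL, task parameters, result parameter name, and SMTP details are placeholders; adjust them (and add whatever authentication your site requires) before using anything like this:

```python
# Submit a print job asynchronously, poll for completion, then email the result link.
import time
import smtplib
from email.message import EmailMessage

import requests

TASK_URL = "https://myserver/arcgis/rest/services/PrintService/GPServer/ExportWebMap"  # placeholder

def submit_print_job(params):
    """Submit the job and return its jobId."""
    resp = requests.post(f"{TASK_URL}/submitJob", data={**params, "f": "json"})
    resp.raise_for_status()
    return resp.json()["jobId"]

def wait_for_job(job_id, poll_seconds=15, timeout_seconds=3600):
    """Poll until the asynchronous job reports success or failure."""
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        info = requests.get(f"{TASK_URL}/jobs/{job_id}", params={"f": "json"}).json()
        status = info.get("jobStatus")
        if status == "esriJobSucceeded":
            return info
        if status in ("esriJobFailed", "esriJobCancelled", "esriJobTimedOut"):
            raise RuntimeError(f"Print job {job_id} ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Print job {job_id} did not finish within {timeout_seconds}s")

def email_result(address, result_url):
    """Send the user a link to the finished print (SMTP host is a placeholder)."""
    msg = EmailMessage()
    msg["Subject"] = "Your print is ready"
    msg["From"] = "gis@myserver"
    msg["To"] = address
    msg.set_content(f"Your print has finished: {result_url}")
    with smtplib.SMTP("mail.myserver") as smtp:
        smtp.send_message(msg)

job_id = submit_print_job({"Web_Map_as_JSON": "{}", "Format": "PDF"})
wait_for_job(job_id)
email_result("user@example.com", f"{TASK_URL}/jobs/{job_id}/results/Output_File")
```

With long timeouts on the polling side, the prints simply queue up behind the available instances instead of failing, and nobody is left watching a spinner in the browser.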