The author's original text is as follows:
Dear members,
There has been some confusion regarding the processing of workunits and point credit for returned workunits. I will try to clarify how the current system works and briefly mention how the upgraded system will handle this.
Currently, workunits are credited as long as the job is still in an "active" state. Once the job is changed from an "active" state, any outstanding workunits will not be accepted and no point credit can be given. This is not new behavior, but some of the work management (e.g. new Rosetta jobs) we have been doing lately may make you think so.
For many months, the Cancer jobs have been in an "active" state, so any workunit could be uploaded and credit received at any time. Since the data is not new and final results have been calculated, all but one Cancer job have been disabled. Anyone still crunching workunits from those other jobs would not have received credit once the job was marked inactive. Please understand that we cannot let jobs run forever.
The Rosetta jobs we have been running have been finishing much more quickly than we had expected (very good). In order to not crunch redundant workunits and to move on to new data, we have been disabling the Rosetta jobs as soon as they finish and we have been starting a new one with new data immediately. This would mean that any dispatched workunits that were still outstanding would not be credited after the job was marked "inactive".
We now realize that this may be causing some frustration and will tweak the Rosetta jobs to minimize this as follows.
1) We will limit the number of concurrent dispatches per workunit. This should limit the number of outstanding dispatched workunits at any given time.
2) We have set the wallclock timeout to three days. Any workunit that takes more than three real days to complete will be discarded.
3) Once we have all of the required number of results per workunit and the job is complete, we will wait the three day wall clock limit before marking the job "inactive". This should greatly limit the number of invalid workunits since we are limiting how many active dispatches there can be.
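The timing in the three points above can be sketched as a small check. This is only an illustration of the stated policy, not the grid software's actual logic: the function names, the example dates, and the treatment of results that arrive exactly at the cutoff are all assumptions.

```python
from datetime import datetime, timedelta

# Both values come from the announcement: a three-day wallclock timeout,
# and a three-day wait after completion before the job goes "inactive".
WALLCLOCK_TIMEOUT = timedelta(days=3)
WAIT_BEFORE_INACTIVE = timedelta(days=3)


def inactive_at(job_completed: datetime) -> datetime:
    """Moment the job is marked "inactive" (hypothetical helper)."""
    return job_completed + WAIT_BEFORE_INACTIVE


def is_credited(dispatched: datetime, returned: datetime,
                job_completed: datetime) -> bool:
    """True if a returned workunit would still earn credit.

    Assumption: a result is credited only if it took no more than three
    real days and arrived before the job was marked "inactive".
    """
    within_timeout = (returned - dispatched) <= WALLCLOCK_TIMEOUT
    before_inactive = returned < inactive_at(job_completed)
    return within_timeout and before_inactive


# Hypothetical dates, for illustration only.
completed = datetime(2000, 1, 10)
on_time = is_credited(datetime(2000, 1, 9), datetime(2000, 1, 11), completed)
too_slow = is_credited(datetime(2000, 1, 9), datetime(2000, 1, 13), completed)
too_late = is_credited(datetime(2000, 1, 12), datetime(2000, 1, 14), completed)
```

Here `on_time` is credited, `too_slow` exceeds the three-day timeout, and `too_late` is a workunit dispatched so close to the cutoff that it comes back after the job is already inactive.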
Note that there will still be some cases where there are dispatched workunits that will not be returned before the job is marked "inactive" and no credit will be given. Also note that once we move to the latest version of the grid software, this will no longer be an issue. When we have all of the results, the job will be marked "suspended" which means no more workunits will be dispatched, but we will still credit any returned workunits. Unfortunately, that same behavior just does not exist in the version we are currently running.
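The difference between the two versions comes down to which job states still accept results. A minimal sketch, assuming the three states named in the announcement and hypothetical helper names (the real grid software's interface is not shown in the post):

```python
from enum import Enum


class JobState(Enum):
    ACTIVE = "active"        # workunits dispatched, results credited
    SUSPENDED = "suspended"  # upgraded version only, per the announcement
    INACTIVE = "inactive"    # nothing dispatched, nothing credited


def dispatches_work(state: JobState) -> bool:
    """Only an active job hands out new workunits."""
    return state is JobState.ACTIVE


def credits_results(state: JobState) -> bool:
    """Active and suspended jobs both credit returned workunits."""
    return state in (JobState.ACTIVE, JobState.SUSPENDED)
```

In the version currently running there is no "suspended" state, so a finished job can only go from `ACTIVE` straight to `INACTIVE`, which is why outstanding workunits lose their credit.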
I know there is frustration in the user community due to perceived "work for nothing", but we are burning through Rosetta jobs in just a few days which is amazing and something everyone should be proud of. We still have a couple of snags getting the new Cancer data uploaded, but I know we will see similarly amazing results.
Please hang in there and thank you for contributing.
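Roughly speaking: the old Rosetta jobs are almost finished, but some machines are still computing old data because they use udtweaker,
and the project cannot let those machines keep computing indefinitely.
So they have decided that old Rosetta workunits not returned within a certain period after a Rosetta job has completely finished will no longer be accepted for credit.
In other words, whether a new Rosetta job has started or Rosetta as a whole has ended, please return any finished Rosetta workunits you have on hand as soon as possible.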