Job Robots Troubleshooting

Description

NO_DWLOD Cannot download default.tgz from gsiftp

SAM tests were also affected. In fact, we could not even log in with uberftp:
uberftp wms213.cern.ch
530-Login incorrect. : globus_gss_assist: Error invoking callout
530-globus_callout_module: The callout returned an error
530-an unknown error occurred
530 End.

We checked whether there were expired CRLs for CERN (there is a probe in the OSG-RSV tests, https://osg-ce.sprace.org.br:8443/rsv/ ); they were fine. We then found a problem in our $VDT_LOCATION/glite/etc/vomses, which pointed cms to a non-existent machine, and recovered this file from backup. The problem was completely resolved only when we provided, on all worker nodes, the correct link to globus/TRUSTED_CA inside their GLOBUS_LOCATION ( /opt/OSG-wn-client ), where they can find their CRLs.
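A check like the following could be run on each worker node to confirm that the TRUSTED_CA link resolves and that CRL files are present behind it. This is only a sketch: check_trusted_ca is a hypothetical helper, the path is passed in as an argument rather than assumed, and the test for *.r0 files relies on the usual convention that CRLs in a grid CA directory are stored as <hash>.r0. It does not check whether the CRLs themselves have expired.

```shell
# Hypothetical helper: verify that a TRUSTED_CA path (usually a symlink)
# resolves to a directory and that it contains CRL files (*.r0).
check_trusted_ca() {
    ca_dir="$1"

    # -d follows symlinks, so a dangling link fails here.
    if [ ! -d "$ca_dir" ]; then
        echo "BROKEN: $ca_dir does not resolve to a directory"
        return 1
    fi

    # Count CRL files; the -e test skips the unexpanded pattern when none exist.
    count=0
    for crl in "$ca_dir"/*.r0; do
        [ -e "$crl" ] && count=$((count + 1))
    done

    if [ "$count" -eq 0 ]; then
        echo "NO_CRLS: $ca_dir contains no *.r0 files"
        return 2
    fi

    echo "OK: $ca_dir holds $count CRL file(s)"
}
```

It would be called with the full path of the link on the node, for example check_trusted_ca "$GLOBUS_LOCATION/TRUSTED_CA" if that is where the link lives on your installation.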

BrokerHelper: no compatible resources

All JobRobot jobs were aborted on our farm. The page http://jobrobot.web.cern.ch/JobRobot/aborted_081019.html#T2_BR_SPRACE showed:
 BrokerHelper: no compatible resources
 request expired
First we checked for corruption in our CMSSW installation by running a CRAB job with the same CMSSW version listed at http://jobrobot.web.cern.ch/JobRobot/summary_081019.html , following the instructions at /twiki/bin/view/Main/EntryDescriptionNo53

This error may be related to an ambiguous BDII publication that causes the following matchmaking requirement to fail:

Member("osg-se.sprace.org.br",other.GlueCESEBindGroupSEUniqueID) 
Our BDII published:
objectClass: GlueSchemaVersion
GlueCESEBindGroupCEUniqueID: osg-ce.sprace.org.br:2119/jobmanager-condor-cms
GlueCESEBindGroupSEUniqueID: osg-se.sprace.org.br
GlueCESEBindGroupSEUniqueID: osg-se.sprace.org.br
GlueSchemaVersionMajor: 1 
Note that GlueCESEBindGroupSEUniqueID: osg-se.sprace.org.br appears twice. To remove the duplicate, we need to fix GIP, which collects the information for CEMon. Editing the file
/OSG/gip/var/ldif/osg-info-static-cesebind.ldif
directly does not seem to work. So we changed gip-attributes.conf, which configure_gip reads to generate this file:
vim /OSG/monitoring/gip-attributes.conf
OSG_GIP_DISK="0"
/OSG/vdt/setup/configure_gip
The result can be checked with
ldapsearch -x -LLL -p 2170 -h lcg-bdii.cern.ch -b mds-vo-name=SPRACE,mds-vo-name=local,o=grid > jobrobot.txt
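The saved jobrobot.txt can then be scanned for entries that still list the same GlueCESEBindGroupSEUniqueID value more than once. The helper below is a sketch, not part of GIP or the BDII tools: find_duplicate_sebind is a hypothetical name, and it assumes that entries in the ldapsearch -LLL output are separated by blank lines and that the attribute lines are not wrapped.

```shell
# Hypothetical helper: print every GlueCESEBindGroupSEUniqueID line that
# occurs more than once within a single LDIF entry of the given file.
find_duplicate_sebind() {
    awk '
        BEGIN { RS = ""; FS = "\n" }      # paragraph mode: one LDIF entry per record
        {
            split("", seen)               # reset the per-entry counter
            for (i = 1; i <= NF; i++) {
                line = $i
                sub(/[ \t]+$/, "", line)  # ignore trailing whitespace
                if (line ~ /^GlueCESEBindGroupSEUniqueID:/ && ++seen[line] == 2)
                    print line
            }
        }
    ' "$1"
}

# Usage: empty output means no entry publishes a duplicated SE binding.
# find_duplicate_sebind jobrobot.txt
```

Counting per entry rather than over the whole file matters here, because several CEs may legitimately bind to the same SE in separate entries.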

Updates

So-and-so on dd/mm/yyyy

Describe what you did.

Someone else on dd/mm/yyyy

More comments

-- MarcoAndreFerreiraDias - 19 Oct 2008

Topic revision: r4 - 2009-08-26 - MarcoAndreFerreiraDias
 
