==== HTCondor-CE - manual setup ====

=== Prerequisites ===

The HTCondor-CE (htc-ce) must be installed on an HTCondor Submit Node (schedd), that is, a machine where a SCHEDD daemon runs:
<code>[root@htc-ce ~]# condor_config_val DAEMON_LIST
MASTER, SCHEDD
</code>
The schedd should already be tested: a local user should be able to submit jobs to the HTCondor slaves (WNs).
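
For example, a quick smoke test of the schedd from a local account could look like this (a sketch; the submit file name and the ''test1'' user are hypothetical):
<code>[test1@htc-ce ~]$ cat sleep.sub
executable = /bin/sleep
arguments  = 60
output     = sleep.out
error      = sleep.err
log        = sleep.log
queue
[test1@htc-ce ~]$ condor_submit sleep.sub
[test1@htc-ce ~]$ condor_q</code>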

Furthermore:

   * The htc-ce must hold a valid X509 grid server certificate (IGTF) (a quick check is sketched after this list)
   * The htc-ce must have a public IP and be reachable from everywhere on TCP port 9619.
   * The htc-ce relies on Argus for authorization, so it needs an Argus server already configured for the site authN/authZ: the server could be installed on the htc-ce node itself, but for sites with more than one CE a centralized one is recommended.
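
A quick sanity check of the first two points might look like this (a sketch: it assumes the host certificate is already in place and that firewalld is in use; adapt to your setup):
<code>[root@htc-ce ~]# openssl x509 -in /etc/grid-security/hostcert.pem -noout -subject -issuer -dates
[root@htc-ce ~]# firewall-cmd --permanent --add-port=9619/tcp && firewall-cmd --reload</code>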

Suggestion: if you manage an HTCondor cluster, it is best practice to use the same uid and gid for the condor user on all machines (central manager, schedd, WNs, CE, …).
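
A minimal sketch of this, to be run on every node before installing the HTCondor RPMs (the uid/gid value 601 is just an example):
<code>[root@htc-ce ~]# groupadd -g 601 condor
[root@htc-ce ~]# useradd -u 601 -g 601 -d /var/lib/condor -s /sbin/nologin condor</code>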

== Common HTCondor rpm installation (valid not only for the condor-ce) ==
The repo for the latest stable release (3.2.3 as of writing) is [[https://research.cs.wisc.edu/htcondor/yum/repo.d/htcondor-stable-rhel7.repo]]
<code>[root@htc-ce ~]# cd /etc/yum.repos.d
[root@htc-ce ~]# wget http://research.cs.wisc.edu/htcondor/yum/repo.d/htcondor-stable-rhel7.repo
[root@htc-ce ~]# cd /root
[root@htc-ce ~]# wget http://research.cs.wisc.edu/htcondor/yum/RPM-GPG-KEY-HTCondor
[root@htc-ce ~]# rpm --import RPM-GPG-KEY-HTCondor</code>

Suggestion: on the slaves (WNs) it is better to create and set a dedicated working dir for HTCondor jobs; the configuration proposed in the following lines reflects this.
<code>[root@wn ~]# mkdir -p /home/condor/execute/
[root@wn ~]# chown -R condor:condor /home/condor/
[root@wn ~]# chmod 755 /home/condor/</code>

This is not the default for HTCondor. In the WN configuration, change the value of the EXECUTE variable accordingly:\\
EXECUTE = /home/condor/execute/

This follows the strategy you may already have implemented with the old CE: placing the users' homes and the input/output sandbox exchange on a device separate from the one hosting the operating system.
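
To verify that a WN picked up the new value, you can query it (sketch):
<code>[root@wn ~]# condor_config_val EXECUTE
/home/condor/execute</code>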

=== HTCondor-CE setup ===

== Certificate configuration ==
<code>[root@htc-ce ~]# mkdir /etc/grid-security
[root@htc-ce ~]# cd /etc/grid-security/
[root@htc-ce ~]# ll
-rw-r--r-- 1 root root 2366 Aug 12 14:40 hostcert.pem
-rw------- 1 root root 1675 Aug 12 14:40 hostkey.pem</code>

Install the grid CA certificates and VO data into /etc/grid-security:
this needs the EGI-trustanchors.repo [[http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo]] in /etc/yum.repos.d/
<code>[root@htc-ce ~]# yum install ca-policy-egi-core</code> (CA certs go to /etc/grid-security/certificates)\\
Copy the "VO data" into /etc/grid-security/vomsdir (lsc files in the right hierarchy). From a cream-ce it is a matter of a simple copy:
<code>[root@cream-ce]# scp -r /etc/grid-security/vomsdir htc-ce:/etc/grid-security/ </code>
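
If you cannot copy them from an existing CE, note that each lsc file simply contains two lines: the subject DN of the VOMS server certificate followed by the DN of its CA. A sketch with hypothetical DNs (take the real values from the VO ID card on the EGI operations portal):
<code>[root@htc-ce ~]# cat /etc/grid-security/vomsdir/cms/voms-cms.example.org.lsc
/DC=org/DC=example/OU=host/CN=voms-cms.example.org
/DC=org/DC=example/CN=Example Certification Authority</code>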

== Rpms installation ==
  * Needed RPMs are: htcondor-ce-bdii, htcondor-ce-client, htcondor-ce-view, htcondor-ce, htcondor-ce-condor
<code>[root@htc-ce ~]# yum install htcondor-ce-condor htcondor-ce-bdii htcondor-ce-view</code> (currently condor version 8.8.10, condor-ce 3.4.3)\\
Check the status of the service and enable it if necessary:
<code>[root@htc-ce ~]# systemctl enable condor-ce</code>
At the end of the configuration, remember to start the condor-ce.
<code>[root@htc-ce ~]# yum install fetch-crl</code>
Run the following command to update the CRLs for the first time; it will take a while (several minutes) to complete, so you can run it in the background or from another shell:
<code>[root@htc-ce ~]# /usr/sbin/fetch-crl
[root@htc-ce ~]# systemctl enable fetch-crl-cron
[root@htc-ce ~]# systemctl start fetch-crl-cron</code>

== GSI and authorization ==
HTCondor-CE relies on Argus for authorization. Refer to the official documentation: [[https://argus-documentation.readthedocs.io/en/stable/|ARGUS]] for general help about installation and configuration (it is possible to go with LCAS/LCMAPS, but that path is not well documented and there is no out-of-the-box recipe for it).\\
This needs the UMD-4-updates.repo in /etc/yum.repos.d/. Then
<code>[root@htc-ce ~]# yum install argus-gsi-pep-callout</code>
As root, create the file /etc/grid-security/gsi-authz.conf. Its content is
<code>[root@htc-ce ~]# cat /etc/grid-security/gsi-authz.conf
globus_mapping /usr/lib64/libgsi_pep_callout.so argus_pep_callout</code>(a newline at the end is not needed here; the LCMAPS variant described in "Tips and tricks" instead requires one)
The callout needs a copy of the host certificate and key that can be accessed with non-root credentials:
<code>[root@htc-ce ~]# cp -a /etc/grid-security/hostcert.pem /etc/grid-security/condorcert.pem
[root@htc-ce ~]# cp -a /etc/grid-security/hostkey.pem /etc/grid-security/condorkey.pem
[root@htc-ce ~]# chown condor:condor /etc/grid-security/condorcert.pem /etc/grid-security/condorkey.pem</code>
As root, create the gsi-pep-callout configuration file; here it is named gsi-pep-callout-condor.conf rather than the default gsi-pep-callout.conf (the custom name must be declared in /etc/sysconfig/condor-ce; see "gsi-pep-callout.conf (file name details)" in the Tips and tricks section)
<code>[root@htc-ce ~]# cat /etc/grid-security/gsi-pep-callout-condor.conf
pep_ssl_server_capath /etc/grid-security/certificates/
pep_ssl_client_cert /etc/grid-security/condorcert.pem
pep_ssl_client_key /etc/grid-security/condorkey.pem
pep_url https://your.argus.server:8154/authz      ## put your Argus server reference
pep_timeout 30 # seconds
xacml_resourceid http://<argus server>/condor-ce</code>

== HTCondor-CE registration in the Argus server ==
The HTCondor-CE has to be registered in the Argus server.\\
In the Argus server, as root, execute:\\
<code>[root@argus ~]# pap-admin lp >policies.txt      ## dump current policies
[root@argus ~]# cp policies.txt policies-htcceAdd.txt</code>

Add the new resource to the service (here an example; modify it as needed):
<code>[root@argus ~]# cat policies-htcceAdd.txt

 resource "http://your.argus.server/condor-ce" {              ## put your argus server reference
     obligation "http://glite.org/xacml/obligation/local-environment-map" {
     }

     action ".*" {
         rule permit { vo="ops" }
         rule permit { vo="dteam" }
         rule permit { vo="alice" }
         rule permit { vo="belle" }
         rule permit { vo="cms" }
         rule permit { vo="lhcb" }
         rule permit { vo="cdf" }
     }
 }</code>

Reset the old policies and import the new ones:
<code>[root@argus ~]# pap-admin rap ; pap-admin apf policies-htcceAdd.txt
[root@argus ~]# pap-admin lp      ## list policies </code>

You can consider installing the Argus service on the HTCondor-CE host itself; this would probably ease the early setup.\\
Be aware that Argus needs read/write access to the /etc/grid-security/gridmapdir directory, which is usually hosted on a shared filesystem.

== Argus client installation ==
Install an Argus pep client on your HTCondor-CE
<code>[root@htc-ce ~]# yum install argus-pepcli</code>

== Verify Argus service registration ==
To verify that your Argus service is properly configured to work with your htc-ce you have to:
  * Create a valid proxy of a supported VO, e.g.: <code>[root@htc-ce ~]# voms-proxy-init --voms cms</code>
  * Copy the proxy to the root dir of your htc-ce as ''user509.pem''
  * Execute the following example command (adapt it to your case)
<code>[root@htc-ce ~]# pepcli --pepd https://your.argus.server:8154/authz --keyinfo user509.pem --capath /etc/grid-security/certificates --cert /etc/grid-security/hostcert.pem --key /etc/grid-security/hostkey.pem --resourceid "http://your.argus.server/condor-ce" -a pippo  ## put your argus server reference; -a requires an arbitrary action string, any value ("pippo" here) will do</code>

On a working setup you should see an output like:

<code>Resource: http://your.argus.server/condor-ce
Decision: Permit
Obligation: http://glite.org/xacml/obligation/local-environment-map/posix (caller should resolve POSIX account mapping)
Username: cms195
Group: cms
Secondary Groups: cms</code>

As a further check (on the Argus service side) you should see that the empty file
''/etc/grid-security/gridmapdir/cms195'' has a hard link whose name is derived from the user's DN subject, which can be identified this way:

<code>[root@argus ~]# ls -li /etc/grid-security/gridmapdir/cms195
383663 -rw-r--r-- 2 root root 0 26 giu  2019 /etc/grid-security/gridmapdir/cms195
[root@argus ~]# ls -li /etc/grid-security/gridmapdir/ | grep 383663
 383663 -rw-r--r-- 2 root root 0 26 giu  2019 %2fdc%3dch%2fdc%3dcern%2fou%3dorganic%20units%2fou%3dusers%2fcn%3dsurname%2fcn%3d606831%2fcn%3dname%20surname:cms:cmsprd:cms
 383663 -rw-r--r-- 2 root root 0 26 giu  2019 cms195</code>

== HTCondor-CE mapfile ==
//An entry has to be ADDED to the condor_mapfile to match the certificate DNs of the hosts at your site.
These are mapped to the value defined by UID_DOMAIN (i.e. "t2-lp" in the following example) in HTCondor and HTCondor-CE.//

<code>[root@htc-ce ~]# grep -v -E '^#|^$' /etc/condor-ce/condor_mapfile
GSI "\/C=IT\/L=Frascati\/O=Istituto Nazionale di Fisica Nucleare\/.*CN=([A-Za-z0-9\-]*).lnl.infn.it$" \1@t2-lp
GSI (.*) GSS_ASSIST_GRIDMAP
GSI "(/CN=[-.A-Za-z0-9/= ]+)" \1@unmapped.htcondor.org  (default)
CLAIMTOBE .* anonymous@claimtobe                                (default)
FS (.*) \1                                                      (default)
</code>
Note in the first line the regular expression matching a set of site certificates; it is a Perl regexp.\\
The ''GSI (.*) GSS_ASSIST_GRIDMAP'' line is the magic that exploits the external Argus authN/authZ:
do not modify any of the other default lines.\\
Warning: the host being installed as HTCondor-CE is also an HTCondor Submit Node. For this reason the value of UID_DOMAIN has to be the same in both condor configuration files (HTCondor and HTCondor-CE).
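
Once the mapfile and the Argus callout are in place, the resulting mapping can be checked end-to-end from a host with the htcondor-ce-client rpm and a valid proxy; a sketch (replace the CE name with yours):
<code>[user@ui ~]$ condor_ce_ping -verbose -name htc-ce.example.org -pool htc-ce.example.org:9619 WRITE</code>
On success the verbose output should include the remote mapping of your proxy subject (e.g. ''cms195@t2-lp'' rather than an @unmapped.htcondor.org identity).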

== HTCondor-CE config files ==
The default configuration path is /etc/condor-ce/config.d/. A tool, ''condor_ce_config_val'', is provided to inspect the configuration, in the same way as condor_config_val for the HTCondor batch system.

condor_ce_config_val usage examples:
   * find where and how an identifier (knob) whose name is exactly known is defined:
<code>[root@htc-ce ~]# condor_ce_config_val -v HTCONDORCE_VONames
HTCONDORCE_VONames = alice, atlas, cdf, cms, dteam, lhcb, virgo
# at: /etc/condor-ce/config.d/06-ce-bdii.conf, line 14 </code>

   * see names and values for identifiers matching a substring:
<code>[root@htc-ce ~]# condor_ce_config_val -d HTCONDORCE
# Configuration from machine: ce03-htc.cr.cnaf.infn.it
# Parameters with names that match HTCONDORCE:
HTCONDORCE_BDII_ELECTION = LEADER
HTCONDORCE_BDII_LEADER = ce03-htc.cr.cnaf.infn.it
HTCONDORCE_CORES = 16 # cores per node
[…] </code>

Most of the predefined knobs already have reasonable values and there should be no reason to alter them; when inspecting the individual configuration files you will find comments providing hints about what should be customized.

Entries worth mentioning are:
<code>HTCONDORCE_VONames = alice, atlas, cdf, cms, dteam, lhcb, virgo
DAEMON_LIST = $(DAEMON_LIST), CEVIEW, GANGLIAD</code>
from the file /etc/condor-ce/config.d/05-ce-view.conf. This enables a CE monitoring webtool (CEView), visible at http://your.ce.fully.qualified.name:80

Warning: the best practice is to edit only the file /etc/condor-ce/config.d/99-local.conf plus a file with the definition of the jobrouter. This way software updates do not overwrite the local configuration files.\\

The following are examples based on the configuration of the Legnaro-Padova T2:
<code>
[root@t2-cce-03 ~]# cd /etc/condor-ce/config.d/
[root@t2-cce-03 config.d]# ll
total 36
-rw-r--r-- 1 root root 2351 Jul  1 23:48 01-ce-auth.conf
-rw-r--r-- 1 root root 4000 Jul  2 16:20 01-ce-router.conf
-rw-r--r-- 1 root root 1029 Jul  1 23:48 01-common-auth.conf
-rw-r--r-- 1 root root 1149 Jul  1 23:48 01-pilot-env.conf
-rw-r--r-- 1 root root  476 Jul  1 23:48 02-ce-condor.conf
-rw-r--r-- 1 root root  240 Jul  1 23:48 03-managed-fork.conf
-rw-r--r-- 1 root root  364 Jul  1 23:48 05-ce-health.conf
-rw-r--r-- 1 root root  902 Jul  1 23:48 05-ce-view.conf
-rw-r--r-- 1 root root  291 Jul  1 23:48 50-ce-apel.conf
lrwxrwxrwx 1 root root   47 Oct 14 15:28 99-z10-ce-site-security.conf -> /sx/condor/shared/10-ce-grid-site-security.conf
lrwxrwxrwx 1 root root   44 Oct 14 15:28 99-z20-ce-job-routes.conf -> /sx/condor/shared/20-ce-grid-job-routes.conf
lrwxrwxrwx 1 root root   39 Oct 14 15:28 99-z30-ce-local.conf -> /sx/condor/shared/30-ce-grid-local.conf

[root@t2-cce-03 ~]# cat /etc/condor-ce/config.d/99-z10-ce-site-security.conf
DEFAULT_DOMAIN_NAME = lnl.infn.it

FRIENDLY_DAEMONS = $(FULL_HOSTNAME)@daemon.htcondor.org/$(FULL_HOSTNAME) condor@users.htcondor.org/$(FULL_HOSTNAME) condor@child/$(FULL_HOSTNAME), $(FULL_HOSTNAME)@$(UID_DOMAIN)/$(FULL_HOSTNAME), *@$(UID_DOMAIN), condor@$(UID_DOMAIN)/$(FULL_HOSTNAME), condor@child/$(FULL_HOSTNAME), *@$(DEFAULT_DOMAIN_NAME)
USERS = *@users.htcondor.org, *@$(UID_DOMAIN)
ALLOW_DAEMON = $(FRIENDLY_DAEMONS), $(FRIENDLY_DAEMONS)
SCHEDD.ALLOW_WRITE = $(USERS), $(FULL_HOSTNAME)@daemon.htcondor.org/$(FULL_HOSTNAME), $(FULL_HOSTNAME)@$(UID_DOMAIN)/$(FULL_HOSTNAME), *@$(UID_DOMAIN)
COLLECTOR.ALLOW_ADVERTISE_MASTER = $(FRIENDLY_DAEMONS), $(FRIENDLY_DAEMONS), *@$(UID_DOMAIN)
COLLECTOR.ALLOW_ADVERTISE_SCHEDD = $(FRIENDLY_DAEMONS), $(FRIENDLY_DAEMONS), *@$(UID_DOMAIN)
COLLECTOR.ALLOW_ADVERTISE_STARTD = $(UNMAPPED_USERS), $(USERS), $(FRIENDLY_DAEMONS), *@$(UID_DOMAIN)
SCHEDD.ALLOW_NEGOTIATOR = $(FULL_HOSTNAME)@daemon.htcondor.org/$(FULL_HOSTNAME), $(FULL_HOSTNAME)@$(UID_DOMAIN)/$(FULL_HOSTNAME), *@$(UID_DOMAIN)
ALLOW_ADMINISTRATOR = $(FULL_HOSTNAME)@daemon.htcondor.org/$(FULL_HOSTNAME), $(FULL_HOSTNAME)@$(UID_DOMAIN)/$(FULL_HOSTNAME), condor@$(UID_DOMAIN)/$(FULL_HOSTNAME)


[root@t2-cce-03 ~]# cat /etc/condor-ce/config.d/99-z20-ce-job-routes.conf
JOB_ROUTER_ENTRIES @=jre
[
name = "condor_pool_dteam";
TargetUniverse = 5;
Requirements = (regexp("dteam", TARGET.x509UserProxyVoName));
#copy_requirements = "original_requirements";
#set_requirements = original_requirements;
MaxJobs = 20;
MaxIdleJobs = 10;
]
[
name = "condor_pool_cms";
TargetUniverse = 5;
Requirements = (regexp("cms", TARGET.x509UserProxyVoName));
#copy_requirements = "original_requirements";
set_requirements = (TARGET.t2_wn_cms =?= true);
MaxJobs = 1000;
MaxIdleJobs = 500;
]
@jre

[root@t2-cce-03 ~]# cat /etc/condor-ce/config.d/99-z30-ce-local.conf
# ALL_DEBUG = D_FULLDEBUG D_CAT
# ALL_DEBUG = D_FULLDEBUG D_CAT D_SECURITY
# MASTER_DEBUG = D_ALWAYS:2 D_CAT
# SCHEDD_DEBUG = D_ALWAYS:2 D_CAT
# GSS_ASSIST_GRIDMAP_CACHE_EXPIRATION = 0

# NETWORK_HOSTNAME = ce01-htc.cr.cnaf.infn.it
# NETWORK_INTERFACE = 131.154.193.64

# enable ceview app
DAEMON_LIST = $(DAEMON_LIST), CEVIEW, GANGLIAD

UID_DOMAIN = t2-lp

# t2-ccm-03.lnl.infn.it is the Central Manager
JOB_ROUTER_SCHEDD2_POOL = t2-ccm-03.lnl.infn.it:9618
START_LOCAL_UNIVERSE = False
START_SCHEDULER_UNIVERSE = $(START_LOCAL_UNIVERSE)
SUBMIT_EXPRS = getenv
DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME = 0
MERGE_JOB_ROUTER_DEFAULT_ADS = True
JOB_ROUTER_DEFAULTS = $(JOB_ROUTER_DEFAULTS_GENERATED) [set_Periodic_Hold = (NumJobStarts >= 1 && JobStatus == 1) || NumJobStarts > 1;]

HTCONDORCE_SPEC = [ specfp2000 = 3326; hep_spec06 = 11.088; specint2000 = 2772 ]
</code>
The files before "99-z*" are installed by the condor RPMs and you should not modify them.

The UID_DOMAIN is the same you chose on the condor batch system side. It is suggested not to set it to a network domain name, so as to distinguish the objects in the logs.\\
\\
The job router config is a very basic example!\\
The ''copy_requirements = "original_requirements";'' statement is for advanced configurations exploiting very well defined user requirements.\\
''t2_wn_cms'' is a custom tag defined internally by the site on the condor batch system WNs dedicated to the CMS experiment.

== BDII configuration in the HTCondor batch system ==

The rpm creates two configuration files and a python script:
<code>[root@htc-ce ~]# rpm -ql htcondor-ce-bdii
/etc/condor/config.d/50-ce-bdii-defaults.conf
/etc/condor/config.d/99-ce-bdii.conf
/var/lib/bdii/gip/provider/htcondor-ce-provider </code>

Warning: the path is under ''/etc/condor/'', thus these are HTCondor configurations, not HTCondor-CE ones.

In /etc/condor/config.d/99-ce-bdii.conf write the details about your site and BDII node.\\
Here is the example related to Legnaro:
<code>[root@htc-ce ~]# cat /etc/condor/config.d/99-ce-bdii.conf
HTCONDORCE_SiteName = INFN-LNL-2
HTCONDORCE_SPEC = [ specfp2000 = 2506; hep_spec06 = 10.63; specint2000 = 2656 ]
# CPU  Benchmarks
HTCONDORCE_VONames = alice, cms, lhcb, dteam
HTCONDORCE_BDII_ELECTION = LEADER
HTCONDORCE_BDII_LEADER = t2-cce-02.lnl.infn.it
HTCONDORCE_CORES = 16 # cores per node
GLUE2DomainID = $(HTCONDORCE_SiteName) </code>

To check that the configuration is formally fine, just execute
<code>[root@htc-ce ~]# /var/lib/bdii/gip/provider/htcondor-ce-provider </code>
a dump of the GLUE2 schema should appear on stdout.

Finally, activate the service with
<code>[root@htc-ce ~]# systemctl enable bdii
[root@htc-ce ~]# systemctl start bdii </code>
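
Once the bdii service is up, the published GLUE2 data can also be queried locally; a sketch, assuming the default BDII port 2170:
<code>[root@htc-ce ~]# ldapsearch -x -LLL -h localhost -p 2170 -b o=glue | head</code>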

== Create grid-users in the HTCondor-CE ==
You must define all the local users that Argus will map, on the CE and on all the WNs (it is assumed that you have already gridified the WNs, with the same local users).
Here is an example based on yaim (CE side):
<code>[root@htc-ce ~]# yum install glite-yaim-core
[root@htc-ce ~]# /opt/glite/yaim/bin/yaim -r -s </path/to>/my-site-info.def -f config_users
2>&1 | tee /root/conf_yaim_users.`hostname -s`.`date +%Y-%m-%d-%H-%M-%S`.log </code>
By default this command returns "INFO: Assuming the node types: UI". Press "y" to continue, even though config_users isn't supported in the UI profile.
It is assumed that your WNs are all "grid-ified" (middleware, users, configs, …).
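
If you prefer not to use yaim, the pool accounts can also be created by hand; a minimal sketch (the account names, count, and group are hypothetical and must match the mappings configured in Argus):
<code>[root@htc-ce ~]# groupadd cms
[root@htc-ce ~]# for i in $(seq -w 1 200); do useradd -m -g cms cms$i; done</code>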

== Start HTCondor-CE process ==
<code>[root@htc-ce ~]# systemctl start condor-ce </code>
NB: there is a bug that causes the stop of the daemon to be very slow (order of minutes): at the end the master daemon will issue a kill -9. You can issue that yourself without worries.

== Testing the HTCondor-CE ==

From a User Interface with the htcondor-ce-client rpm, after generating a valid proxy, the htc-ce can be tested with
<code>[root@ui ~]# condor_ce_trace <CE fqdn> </code>
Use the ''--debug'' option to get more details.
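
Beyond the trace, a real end-to-end submission can be tested from the same User Interface with a grid universe job; a sketch (replace the CE fqdn with yours):
<code>[user@ui ~]$ cat ce-test.sub
universe           = grid
grid_resource      = condor htc-ce.example.org htc-ce.example.org:9619
use_x509userproxy  = true
executable         = /bin/hostname
output             = ce-test.out
error              = ce-test.err
log                = ce-test.log
queue
[user@ui ~]$ condor_submit ce-test.sub</code>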


=== Tips and tricks ===

== How to change the uid and gid of an already created condor user ==
In this example the old condor uid 993 is remapped to 601, and the old condor gid 990 is remapped to 601:
<code>systemctl stop condor
id condor
find / -xdev -uid 993 -exec ls -lnd {} \; | wc -l    # count the files owned by the old uid
find / -xdev -gid 990 -exec ls -lnd {} \; | wc -l    # count the files owned by the old gid
vim /etc/passwd    # change the condor uid and gid to 601
vim /etc/group     # change the condor gid to 601
find / -xdev -uid 993 -exec chown 601 {} \;
find / -xdev -gid 990 -exec chgrp 601 {} \;
find / -xdev -uid 601 -exec ls -lnd {} \; | wc -l    # should match the counts above
find / -xdev -gid 601 -exec ls -lnd {} \; | wc -l
reboot </code>

== If condor does not start correctly (config from shared FS) ==
Sometimes, if you keep the HTCondor configuration of the nodes on a shared (nfs, gpfs, …) file system, it can happen that the nfs client is ready even though not all mount points have been completely mounted yet.\\
The solution is to add the mount point (named "sx" in the example) to the /usr/lib/systemd/system/condor.service file (the After= directive must stay on a single line):
<code>After=network-online.target nslcd.service ypbind.service time-sync.target nfs-client.target autofs.service sx.mount </code>
The example refers to a mount point described in the fstab as:
<code>gwc:/sx                 /sx                     nfs     defaults,bg,intr   0 0 </code>
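
Since /usr/lib/systemd/system/condor.service is overwritten by package updates, a drop-in override may be preferable; a sketch (''systemctl edit'' creates /etc/systemd/system/condor.service.d/override.conf, and After= entries from drop-ins are additive):
<code>[root@wn ~]# systemctl edit condor
# in the editor that opens, add:
[Unit]
After=sx.mount</code>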

== Alternative solution to the Argus service (to be implemented) ==
Assuming LCMAPS is used, the configuration of the files is like that of a cream-ce:
<code>[root@cream-ce cream]# cat /etc/grid-security/gsi-authz.config
globus_mapping /usr/lib64/liblcas_lcmaps_gt4_mapping.so lcmaps_callout
[root@htc-ce config.d]# cat /etc/grid-security/gsi-authz.conf
globus_mapping liblcas_lcmaps_gt4_mapping.so lcmaps_callout </code>
(gsi-authz.conf must end with a newline, as reported in [[https://opensciencegrid.org/docs/security/lcmaps-voms-authentication/#optional-configuration]])

== /etc/grid-security content details ==
If you have the Argus server installed on another host, the only files you need are:
<code>drwxr-xr-x  2 root   root   53248 Aug 26 14:40 certificates (managed by fetch-crl)
-rw-r--r--  1 condor condor  2366 Aug 12 14:40 condorcert.pem
-rw-------  1 condor condor  1675 Aug 12 14:40 condorkey.pem
-rw-r--r--  1 root   root      66 Aug 26 14:53 gsi-authz.conf
-rw-r--r--  1 root   root    3121 Aug 26 17:48 gsi.conf (from the condor install; generally you do not need to modify it)
-rw-r--r--  1 root   root     502 Aug 27 16:40 gsi-pep-callout.conf
-rw-r--r--  1 root   root    2366 Aug 12 14:40 hostcert.pem
-rw-------  1 root   root    1675 Aug 12 14:40 hostkey.pem
drwxr-xr-x 13 root   root    4096 Sep 11  2016 vomsdir </code>

In case you install the Argus server on the same node, or you plan to use lcas/lcmaps, you will need all the stuff you know very well from a CREAM CE, so in addition you must have:
<code>lrwxrwxrwx  1 root   root      43 Aug 26 16:26 gridmapdir -> /shared/from/file/system
-rw-r--r--  1 root   root    2938 Aug 26 18:29 grid-mapfile
-rw-r--r--  1 root   root    2854 Aug 26 18:24 groupmapfile
-rw-r--r--  1 root   root    2938 Aug 20 12:26 voms-grid-mapfile
-rw-r--r--  1 root   root    2938 Aug 20 12:26 voms-mapfile </code>
The gridmapdir is shared among all the nodes that actually map a grid user to a local user.\\
The voms-grid-mapfile and voms-mapfile are generally copies of the grid-mapfile; on the Argus server instance they are not needed (see the argus config files to track which files are sourced).

== Aliases for HTCondor commands ==
It is useful to define these aliases in /root/.bashrc
<code>alias ccv='condor_config_val'
alias cccv='condor_ce_config_val' </code>

== The restart of the HTCondor startd (WNs) kills running jobs ==
To apply a new configuration, use the ''condor_reconfig'' command instead of restarting the services.

== gsi-pep-callout.conf (file name details) ==
If you already have a production Argus, your defaults could be different, depending on what was decided by your farming department, which manages the host installation and configuration processes (puppet/foreman/whatever, …).\\
The name of this file comes from the definition file here (the version numbers could be different):\\
''/usr/share/doc/argus-gsi-pep-callout-1.3.1/gsi-pep-callout.conf''\\
(installed by argus-gsi-pep-callout-1.3.1-2.el7.centos.x86_64.rpm) and can be modified in:\\
''/etc/sysconfig/condor-ce''