Update the JSDC in 2020 (v2020 ?)

This page gathers notes on the overall process to obtain the updated JSDC catalog (v2020) based on Simbad SPTYPE + VJHK mags (tycho2+2MASS).

SearchCal changes (2020)

The JSDC scenarios retrieve GAIA information (I/345 (DR2), I/347 (dist)) by using a crossmatch with radius = 1as (ASCC vs GAIA positions) and the SearchCal server gives more details on its crossmatch algorithm (number of mates within 5as, separation, separation between the first and the 2nd object).

Problem: as GAIA coordinates are the most accurate, RA/DE and pmRA/pmDE are now overwritten with values from the corresponding GAIA target. However, this leads to confusion for close binaries or ASCC duplicates (within 1 as), as several rows in the JSDC output will have the same coordinates (and same GAIA ID): ~ 5000 groups.

2020.09: filter out all these potential conflicting sources (duplicates) in the sclcatFilterCatalog.sh script.
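
A minimal Python sketch of this duplicate filter, assuming each row is a dict with a hypothetical `gaia_id` key. The real filtering is done in sclcatFilterCatalog.sh and may differ; this only illustrates the "remove every member of a conflicting group" policy.

```python
from collections import Counter

def drop_shared_gaia_ids(rows, key="gaia_id"):
    """Remove every row whose GAIA identifier appears more than once.

    Rows without a GAIA identifier (None or empty) are kept untouched.
    The key name is an assumption made for this illustration.
    """
    counts = Counter(r[key] for r in rows if r.get(key))
    return [r for r in rows if not r.get(key) or counts[r[key]] == 1]

rows = [
    {"name": "A", "gaia_id": "G1"},
    {"name": "B", "gaia_id": "G1"},   # shares its GAIA ID with A
    {"name": "C", "gaia_id": "G2"},
    {"name": "D", "gaia_id": None},   # no GAIA match, kept
]
kept = drop_shared_gaia_ids(rows)
print([r["name"] for r in kept])
```

Both A and B are dropped here, since neither can be trusted to own the GAIA target.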

Meta-data of New SearchCal columns:

Only main columns are listed (xx.confidence and xx.origin are omitted):

  • Added GAIA columns:
GAIA(String) - GAIA DR2 Release Catalog name, click to call VizieR on this object

Bp(Double)/mag - GAIA: Integrated Bp mean magnitude (Vega)
e_Bp(Double)/mag - GAIA: Standard error of BP mean magnitude (Vega)

G(Double)/mag - GAIA: G-band mean magnitude (Vega)
e_G(Double)/mag - GAIA: Standard error of G-band mean magnitude (Vega)

Rp(Double)/mag - GAIA: Integrated Rp mean magnitude (Vega)
e_Rp(Double)/mag - GAIA: Standard error of Rp mean magnitude (Vega)

e_RadVel(Double)/km/s - Radial velocity error 

gaia_dist(Double)/pc - GAIA: Estimated distance
b_gaia_dist(Double)/pc - GAIA: Lower bound on the confidence interval of the estimated distance
B_gaia_dist(Double)/pc - GAIA: Upper bound on the confidence interval of the estimated distance

Teff(Double)/K - GAIA: Stellar effective temperature (estimate from Apsis-Priam)
b_Teff(Double)/K - GAIA: Uncertainty (lower) on Teff estimate from Apsis-Priam (16th percentile)
B_Teff(Double)/K - GAIA: Uncertainty (upper) on Teff estimate from Apsis-Priam (84th percentile)

Stilts script to plot HR diagram (gaia_dist):

stilts plot2plane \
   xpix=679 ypix=495 \
   xlabel=V-K ylabel='-(V-5.0*(log10(gaia_dist)-1))' grid=true \
   xmin=-3.31 xmax=12.95 ymin=-14.2 ymax=11 \
   legend=true \
   layer=Mark \
      in='match(12,6)' \
      x=V-K y='-(V-5.0*(log10(gaia_dist)-1))' \
      shading=auto size=0 \
      leglabel='26: All' 
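
The y axis of this plot is the negated absolute magnitude derived from the distance modulus, M_V = V - 5 * (log10(d) - 1) with d in parsec. A small Python sketch of the same expression (function names are illustrative only, not part of SearchCal):

```python
import math

def absolute_magnitude(v_mag, dist_pc):
    """M_V = V - 5 * (log10(d) - 1), with d in parsec."""
    if dist_pc <= 0:
        raise ValueError("distance must be positive")
    return v_mag - 5.0 * (math.log10(dist_pc) - 1.0)

def hr_point(v_mag, k_mag, dist_pc):
    """Return (x, y) as plotted above: x = V-K, y = -M_V."""
    return v_mag - k_mag, -absolute_magnitude(v_mag, dist_pc)

# A V = 5 star at 10 pc has M_V = 5 by definition of the distance modulus
print(absolute_magnitude(5.0, 10.0))
```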

Note: columns with 'b_' or 'B_' prefixes may cause problems in IDL scripts (Uppercase in MWRFITS)

  • New XMATCH columns (XM_ prefix):
The new XMATCH_LOG column gives lots of information about the xmatch steps performed on main catalogs (ASCC, HIP, GAIA, 2MASS, WISE) for every target.

XMATCH_LOG(String) - xmatch log (internal JMMC)
XM_ASCC_n_mates(Integer) - Number of mates within 5 as in ASCC catalog
XM_ASCC_sep(Double)/as - Angular Separation of the first object in ASCC catalog
XM_ASCC_sep_2nd(Double)/as - Angular Separation between first and second objects in ASCC catalog

XM_HIP_n_mates(Integer) - Number of mates within 5 as in HIP1/2 catalogs
XM_HIP_sep(Double)/as - Angular Separation of the first object in HIP1/2 catalogs
XM_HIP_sep_2nd(Double)/as - Angular Separation between first and second objects in HIP1/2 catalogs

XM_GAIA_n_mates(Integer) - Number of mates within 5 as in GAIA catalog
XM_GAIA_sep(Double)/as - Angular Separation of the first object in GAIA catalog
XM_GAIA_sep_2nd(Double)/as - Angular Separation between first and second objects in GAIA catalog

XM_2MASS_n_mates(Integer) - Number of mates within 5 as in 2MASS catalog
XM_2MASS_sep(Double)/as - Angular Separation of the first object in 2MASS catalog
XM_2MASS_sep_2nd(Double)/as - Angular Separation between first and second objects in 2MASS catalog

XM_WISE_n_mates(Integer) - Number of mates within 5 as in WISE catalog
XM_WISE_sep(Double)/as - Angular Separation of the first object in WISE catalog
XM_WISE_sep_2nd(Double)/as - Angular Separation between first and second objects in WISE catalog
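
To illustrate what the three XM_ columns describe for one target and one catalog, here is a hedged Python sketch. It assumes candidate positions are given as flat-sky offsets (dx, dy) in arcsec relative to the target, and that n_mates counts every candidate within 5 as (including the best one); the actual SearchCal implementation may differ on both points.

```python
import math

def xmatch_stats(offsets_as, mates_radius_as=5.0):
    """Return (n_mates, sep, sep_2nd) for one target.

    offsets_as: list of (dx, dy) candidate offsets from the target in
    arcsec (flat-sky approximation, an assumption of this sketch).
    sep is the separation of the closest candidate; sep_2nd is the
    separation between the closest and the second closest candidates.
    """
    near = sorted(
        (o for o in offsets_as if math.hypot(*o) <= mates_radius_as),
        key=lambda o: math.hypot(*o),
    )
    n_mates = len(near)
    sep = math.hypot(*near[0]) if n_mates >= 1 else None
    sep_2nd = math.dist(near[0], near[1]) if n_mates >= 2 else None
    return n_mates, sep, sep_2nd

# Two candidates within 5 as, one outside
print(xmatch_stats([(0.3, 0.4), (0.3, 2.4), (6.0, 0.0)]))
```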

JMDC inputs

See the complete procedure at /SCALIB/DEVELOPMENT/alx/config/JMDC/scriptHandleLastVersionOf_JMDC_dot_xls_ToProduceJMDC_dot_fits.sh

The script scriptHandleLastVersionOf_JMDC_dot_xls_ToProduceJMDC_dot_fits.sh performs the following operations:

  • prepare the JMDC (csv) input catalog by using the GetStar service (sclsvrServer on the local machine) to get photometries + SIMBAD information (sp type) and merge with stilts => 'JMDC..._intermediate.fits' file
  • (optional) decode SpType using alxDecodeSpectralType test program => 'JMDC..._final.fits' file
  • correct LD_DIAM using the gdl script 'update_ld_in_jmdc.pro' (TBD in SearchCal) => 'JMDC..._final_lddUpdated.fits' file
  • correct LD_DIAM using the gdl script 'make_jsdc_script_simple.pro' (many arguments and options, see below)

Changes in 2020:

This script has the following dependencies: stilts + gdl (astro lib ie MRDFITS + MPFIT procedures):

    export GDL_PATH="/home/bourgesl/apps/astron/pro/:/home/bourgesl/dev/gdl/src/pro/:/home/bourgesl/dev/gdl/src/pro/CMprocedures/"

Inputs:

  • JMDC: get latest JMDC catalog from http://jmdc.jmmc.fr/ by clicking on the CSV button (top)
  • JSDC (full or filtered database) to compute JSDC using the gdl script (all SearchCal columns needed)

Commands:

 ./scriptHandleLastVersionOf_JMDC_dot_xls_ToProduceJMDC_dot_fits.sh JMDC_full_20200730.csv   # to make all steps on the given JMDC catalog

 gdl -e "make_jsdc_script_simple,\"JMDC_full_20200730_final_lddUpdated.fits\",verbose=1,nocatalog=1"    # to compute only polynoms (JMDC fit)

 gdl -e "make_jsdc_script_simple,\"JMDC_full_20200730_final_lddUpdated.fits\",\"catalog2020.fits\",verbose=1"    # to perform both JMDC fitting and JSDC computation

Notes on JMDC fitting:

  • errors on magnitudes are corrected in the make_jsdc_polynoms.pro script (EMAG_MIN = 0.01 on B/V, EMAG[JHK] = max(EMAG[JHK])) : TODO check JSDC computation + SearchCal code
  • Bands [VJ VH VK] could be changed LATER to use other magnitudes (G, N...) but now the database has too few samples (G) or the min/max(EMAG) must be corrected on these bands
  • to get more samples (use DSPECTRAL_TYPE LE 4), see '; LBO: try ignoring small DSPTYPE_B <= 4'
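
One possible reading of the magnitude error correction noted above, as a Python sketch: a floor of 0.01 mag on the B and V errors, and the largest of the J, H, K errors applied to all three bands for a given star. This interpretation is an assumption; the make_jsdc_polynoms.pro script remains the reference.

```python
EMAG_MIN = 0.01  # floor applied to B and V errors (value from the note above)

def corrected_errors(e_b, e_v, e_j, e_h, e_k):
    """Return corrected (e_B, e_V, e_J, e_H, e_K) magnitude errors."""
    e_jhk = max(e_j, e_h, e_k)  # same (largest) error used for J, H and K
    return (max(e_b, EMAG_MIN), max(e_v, EMAG_MIN), e_jhk, e_jhk, e_jhk)

print(corrected_errors(0.001, 0.02, 0.02, 0.03, 0.025))
```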

JSDC candidate catalog (to build the JSDC database)

See procedure at /vobs/config/prepare_candidates/proc.sh

This script generates the 2 input files in the vobs/config folder for the JSDC_BRIGHT (vobsascc_simbad_sptype.cfg) / JSDC_FAINT (vobsascc_simbad_no_sptype.cfg) input catalogs (candidates) used to run SearchCal scenarios and obtain the JSDC full databases (data + computations).

How to update the input catalogs ?

 ./proc.sh

Overall description of the proc.sh :

  • get both ASCC & SIMBAD full catalogs via cdsxmatch.u-strasbg.fr/QueryCat queries (vot)
  • crossmatch catalogs using radius = 2 as (all from 1) on J2000 coords (corrected by PM) + compute group size at 5 as in both catalogs (GROUP_SIZE)
  • produce input catalogs in SearchCal vobsCatalog format (ascii art):
    • RAJ2000/DEJ2000, pmRA, pmDE (ASCC)
    • MAIN_ID, SP_TYPE, OTYPES (Simbad at 2as)
    • GROUP_SIZE (max GroupSize in both catalogs at 5as)
  • split the full ASCC in 2 lists (!NULL_SP_TYPE / NULL_SP_TYPE) (2017-04-14)
    • vobsascc_simbad_sptype.cfg: 494401 rows
    • vobsascc_simbad_no_sptype.cfg: 2006912 rows
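
The last two steps (GROUP_SIZE and the BRIGHT / FAINT split) can be sketched in Python as follows. This is only an illustration under assumed row dictionaries; proc.sh performs these steps with stilts.

```python
def group_size(n_ascc_5as, n_simbad_5as):
    """GROUP_SIZE: largest group size found at 5 as in either catalog."""
    return max(n_ascc_5as, n_simbad_5as)

def split_by_sptype(rows):
    """Split candidates into (with SP_TYPE, without SP_TYPE).

    A missing, empty or blank SP_TYPE sends the row to the second list
    (the JSDC_FAINT candidates).
    """
    with_sptype, without_sptype = [], []
    for row in rows:
        sp = (row.get("SP_TYPE") or "").strip()
        (with_sptype if sp else without_sptype).append(row)
    return with_sptype, without_sptype

rows = [
    {"MAIN_ID": "star 1", "SP_TYPE": "G2V"},
    {"MAIN_ID": "star 2", "SP_TYPE": ""},
    {"MAIN_ID": "star 3", "SP_TYPE": None},
]
bright, faint = split_by_sptype(rows)
print(len(bright), len(faint))
```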

Comments:

  • ASCC is a catalog compilation (TYCHO2 + others) that contains many 'duplicates' within 2as (tycho2 components or other known binaries ...)
  • Simbad coordinates (used by crossmatch) now use Gaia DR2 ones for 3M objects where SIMBAD retrieved the 'unique' GAIA object

Notes:

  • these files were produced on 2017-04-14 and not updated in 2020, as a cds-healpix-java bug (in 2020 stilts/topcat releases) makes the procedure fail (simbad 11M objects):
processing @ date: mardi 15 septembre 2020, 09:59:32 (UTC+0000)
Params: Max Error(Number)/arcsec=5.0
Tuning: HEALPix k(Integer)=11
Binning rows for table 2.............................................
Error: Hash value 13421764095 must be in [0, 12884901888[
  • 2020: using former stilts release (3.7), the procedure works and gives 500 591 rows for the JSDC_BRIGHT candidate catalog (ASCC + SIMBAD < 2 as with SPTYPE)
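
As a side check on the error message quoted in the notes above: a HEALPix map of order k has 12 * 4^k pixels, and the upper bound 12884901888 is exactly the pixel count at order 15. The short Python sketch below only verifies this arithmetic; it says nothing about the cause of the bug itself.

```python
def healpix_npix(order):
    """Number of HEALPix pixels at a given order: 12 * 4**order."""
    return 12 * 4 ** order

bound = 12884901888        # upper bound quoted in the error message
bad_hash = 13421764095     # offending hash value quoted in the error message

print(healpix_npix(15) == bound)   # the bound is the order-15 pixel count
print(bad_hash >= bound)           # the hash is outside the valid range
```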

How to run SearchCal JSDC scenarios ?

These scenarios are still executed on the server apps-old.jmmc.fr on my local account (high memory, high disk usage).

The sclsvrServer will load the JSDC local catalog (vobsascc_simbad_....cfg) as a primary list (no duplicate checks) and then query Vizier catalogs to complete the (vobsSTAR) database (caching). Once the scenario is done (all queries OK), then it will compute calibrators (angular diameters, missing mags ...) and finally save the jsdc.vot (votable).

Scripts and outputs are located at: /home/users/bourgesl/dev/SCALIB/DEVELOPMENT/sclsvr/test/JSDC

# Edit the script runJSDC.sh:
vi runJSDC.sh
# bright or faint JSDC scenario:
export BRIGHT=true    # true (BRIGHT ~ 500K) or false (FAINT ~ 2M)

# Run the scenario:
nohup ./runJSDC.sh &

Running a scenario is very long (~ 1 week to perform all Vizier queries) and may encounter CDS network errors.

It will produce:

  • the runJSDC.log (full sclsvr log)
  • intermediate Vizier (cached) results (vobsSTAR_LIST) are stored at: /home/users/bourgesl/MCSTOP/data/tmp
-rw-r--r-- 1 bourgesl jmmc 273681398 Jul 29 18:20 Search_JSDC_BRIGHT_1_K_I_280B_1.log
-rw-r--r-- 1 bourgesl jmmc 367164719 Jul 29 23:59 Search_JSDC_BRIGHT_2_K_I_280_2.log
-rw-r--r-- 1 bourgesl jmmc  79742526 Jul 30 02:11 Search_JSDC_BRIGHT_3_K_I_311_hip2_2.log
-rw-r--r-- 1 bourgesl jmmc  83636898 Jul 30 04:24 Search_JSDC_BRIGHT_4_K_I_239_hip_main_2.log
-rw-r--r-- 1 bourgesl jmmc 556181610 Jul 30 22:26 Search_JSDC_BRIGHT_5_K_I_345_gaia2_2.log
-rw-r--r-- 1 bourgesl jmmc 331581141 Aug  1 03:18 Search_JSDC_BRIGHT_6_K_I_347_gaia2dis_2.log
-rw-r--r-- 1 bourgesl jmmc 359440257 Aug  1 05:24 Search_JSDC_BRIGHT_7_K_II_246_out_2.log
-rw-r--r-- 1 bourgesl jmmc 319040280 Aug  1 23:01 Search_JSDC_BRIGHT_8_K_II_328_allwise_2.log
-rw-r--r-- 1 bourgesl jmmc   3075121 Aug  2 02:35 Search_JSDC_BRIGHT_9_K_II_7A_catalog_2.log
-rw-r--r-- 1 bourgesl jmmc  62685839 Aug  2 16:20 Search_JSDC_BRIGHT_10_K_I_196_main_2.log
-rw-r--r-- 1 bourgesl jmmc   5037541 Aug  2 20:28 Search_JSDC_BRIGHT_11_K_V_50_catalog_2.log
-rw-r--r-- 1 bourgesl jmmc   1186483 Aug  2 22:36 Search_JSDC_BRIGHT_12_K_V_36B_bsc4s_2.log
-rw-r--r-- 1 bourgesl jmmc   1792599 Aug  3 00:16 Search_JSDC_BRIGHT_13_K_B_sb9_main_2.log
-rw-r--r-- 1 bourgesl jmmc  37934632 Aug  3 01:50 Search_JSDC_BRIGHT_14_K_B_wds_wds_2.log
-rw-r--r-- 1 bourgesl jmmc  92208790 Aug  3 04:28 Search_JSDC_BRIGHT_15_K_II_297_irc_2.log
  • Final Scenario (cached) results (vobsSTAR_LIST) are stored at: /home/users/bourgesl/MCSTOP/data/tmp/GetCal
-rw-r--r-- 1 bourgesl jmmc  403188594 Jun  1  2017 SearchListBackup_JSDC_BRIGHT.dat
-rw-r--r-- 1 bourgesl jmmc 1512361965 Jun  1  2017 SearchListBackup_JSDC_FAINT.dat

Notes:

  • The output jsdc.vot table is complete ie it still has all input objects
  • Filtering (diamFlag/duplicates) is necessary, see below:

Filtering SearchCal (JSDC) votable

The sclcat/src/sclcatFilterCatalog.sh is the script processing the raw SearchCal jsdc.vot (all columns + all rows) to filter / mark rows.

Overall description of the sclcatFilterCatalog.sh:

  • 2020: Check duplicates
    • Filter identifiers(SIMBAD, GAIA, 2MASS IDs) to ensure proper identification on these main catalogs
    • Keep only TYCHO2 1st component (select TYC3=='1')
    • TODO: decide if keep0 (remove all) or keep1 (1st kept)
    • TODO: check GroupSize (ASCC vs SIMBAD < 2as), XM_..._n_mates (5as) or XM_..._sep2nd values
    • should we mark groups (new proximity flag or ambiguous match) or remove all (YES for now) ?
    • using the synthetic column MIN_SEP_2ND = min(XM_HIP_sep_2nd,XM_ASCC_sep_2nd,XM_GAIA_sep_2nd,XM_2MASS_sep_2nd,XM_WISE_sep_2nd,5.0) could help selecting unambiguous matches (ie the separation to the 2nd candidate is higher than 2 as) ?
  • (Disabled since 2017.5 > JSDC2_GD) remove all duplicates (HIP, HD, DM, duplicated coordinates)
  • (Disabled)
  • Keep stars with diamFlag==1 (consistent diameters)
  • (Disabled) compute CalFlag (done in SearchCal / GDL script)
    • bit 0: (diam_chi2>5) => 1
    • bit 1: SB9 / WDS (sep < 2as) : (NULL_SBC9||sep1<=2.||sep2<=2.) => 1
    • bit 2: check SIMBAD ObjectTypes (ObjectTypes_2017.ods) => 1
  • Rejecting badcal stars (remove all distance < 5as)
  • Store intermediate filtered JSDC (filtered.vot) (all columns)
  • Filter out columns to make (small) JSDC and store final JSDC (final.fits)
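
The CalFlag bits listed above can be sketched in Python as below. The argument names are hypothetical: `in_sb9` stands for "the star has an SB9 entry", `wds_sep1` / `wds_sep2` for the two separations tested in the expression, and `bad_otype` for a SIMBAD object type rejected by ObjectTypes_2017.ods. The authoritative computation is the one in SearchCal / the GDL script.

```python
def cal_flag(diam_chi2, in_sb9, wds_sep1, wds_sep2, bad_otype):
    """Build the 3-bit CalFlag described above (illustrative sketch).

    Separations are in arcsec; None means the value is missing.
    """
    def close(sep):
        return sep is not None and sep <= 2.0

    flag = 0
    if diam_chi2 is not None and diam_chi2 > 5.0:
        flag |= 1 << 0   # bit 0: poor diameter fit (chi2 > 5)
    if in_sb9 or close(wds_sep1) or close(wds_sep2):
        flag |= 1 << 1   # bit 1: SB9 entry or companion within 2 as
    if bad_otype:
        flag |= 1 << 2   # bit 2: rejected SIMBAD object type
    return flag

print(cal_flag(6.0, False, 1.5, None, True))
```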

* Run /home/users/bourgesl/JSDC/filter.sh and publish the results

* JSDC snapshots (+ logs) are stored at: http://jmmc.fr/~bourgesl/sclsvr_JSDC/

Note: the script generates many intermediate fits files (catalog5.fits for example) that can be used as the JSDC (full or filtered catalog) of the GDL fitting (+JSDC2) script (all SearchCal columns are needed).

JSDC x MDFC

Latest MDFC (mdfc10.fits) from https://matisse.oca.eu/foswiki/bin/view/Main/TheMid-infraredStellarDiametersAndFluxesCompilationCatalogue%28MDFC%29

A catalogue of stellar diameters and fluxes for mid-infrared interferometry
P. Cruzalèbes
Article: https://arxiv.org/abs/1910.00542

MDFC-10 stats: Name / SpType (Simbad), RAJ2000/DEJ2000 (JSDC or GAIA ?), distance / teff_gaia (GAIA), LDD_est / e_diam_est / CalFlag (JSDC), J/H/Kmag (2MASS), W4mag (WISE), IRflag (MDFC)

Total Rows: 465857
+--------------+-------------+-------------+--------------+-----------+--------+
| column       | mean        | stdDev      | min          | max       | good   |
+--------------+-------------+-------------+--------------+-----------+--------+
| Name         |             |             |              |           | 465857 |
| SpType       |             |             |              |           | 465857 |
| RAJ2000      |             |             |              |           | 465857 |
| DEJ2000      |             |             |              |           | 465857 |
| distance     | 567.73395   | 660.71625   | 1.3011867    | 18404.096 | 460564 |
| teff_midi    | 4384.0073   | 1514.4413   | 3327         | 25645     | 403    |
| teff_gaia    | 6099.031    | 1522.948    | 2241.0       | 64977.0   | 463781 |
| Comp         |             |             |              |           | 4946   |
| mean_sep     | 27.04654    | 96.56517    | -1.0         | 999.9     | 28003  |
| mag1         | 9.140074    | 1.6275178   | -1.47        | 16.2      | 28003  |
| mag2         | 11.752113   | 2.1467986   | 0.18         | 24.6      | 27611  |
| diam_midi    | 3.2239852   | 1.9918951   | 0.92         | 20.398    | 402    |
| e_diam_midi  | 0.016890548 | 0.011792271 | 0.004        | 0.087     | 402    |
| diam_cohen   | 2.7435546   | 1.189701    | 1.59         | 10.03     | 422    |
| e_diam_cohen | 0.036658768 | 0.016024187 | 0.018        | 0.121     | 422    |
| diam_gaia    | 0.17881162  | 0.3026772   | 9.893455E-4  | 43.475067 | 429824 |
| LDD_meas     | 2.67697     | 3.2892106   | 0.225        | 39.759    | 567    |
| e_diam_meas  | 0.34178808  | 4.2889524   | 0.001        | 120.0     | 788    |
| UDD_meas     | 4.4887686   | 16.708101   | 0.05         | 420.0     | 700    |
| band_meas    |             |             |              |           | 781    |
| LDD_est      | 0.18713047  | 0.33603582  | 0.002        | 44.85     | 465605 |
| e_diam_est   | 0.007890246 | 0.029105874 | 0.0          | 2.885     | 465605 |
| UDDL_est     | 0.1849376   | 0.33243763  | 0.002        | 44.459    | 465605 |
| UDDM_est     | 0.18540157  | 0.3329401   | 0.002        | 44.476    | 465605 |
| UDDN_est     | 0.18608275  | 0.33407295  | 0.002        | 44.603    | 465605 |
| Jmag         | 8.347455    | 1.5624622   | -2.989       | 15.447    | 465188 |
| Hmag         | 8.036082    | 1.7315052   | -4.007       | 15.439    | 465857 |
| Kmag         | 7.9406104   | 1.7788626   | -4.378       | 15.638    | 465857 |
| W4mag        | 7.4334354   | 1.5221424   | -6.859       | 10.149    | 465113 |
| CalFlag      | 0.09299205  | 0.55026287  | 0            | 7         | 465857 |
| IRflag       | 1.0827936   | 1.554538    | 0            | 7         | 465857 |
| nb_Lflux     | 1.9706455   | 0.34433657  | 0            | 5         | 465857 |
| med_Lflux    | 1.5254774   | 42.61173    | 1.6299028E-4 | 15720.791 | 465755 |
| disp_Lflux   | 0.30919802  | 15.989581   | 2.0396099E-8 | 7742.2144 | 432072 |
| nb_Mflux     | 1.9757823   | 0.36098757  | 0            | 6         | 465857 |
| med_Mflux    | 0.8892675   | 19.973076   | 8.886453E-5  | 7141.7    | 465757 |
| disp_Mflux   | 0.22529167  | 9.646683    | 5.681251E-10 | 4292.073  | 432130 |
| nb_Nflux     | 2.4884224   | 0.94565606  | 0            | 9         | 465857 |
| med_Nflux    | 0.33924013  | 18.71475    | 9.8775825E-5 | 9278.1    | 465750 |
| disp_Nflux   | 0.0845671   | 4.924083    | 6.325757E-10 | 2386.5    | 433969 |
| Lcorflux_30  | 1.2305617   | 10.585987   | 1.6299028E-4 | 1681.6018 | 465521 |
| Lcorflux_100 | 1.0548543   | 5.1469135   | 1.6299027E-4 | 956.6236  | 465521 |
| Lcorflux_130 | 0.9943006   | 4.3013268   | 1.6299025E-4 | 872.3458  | 465521 |
| Mcorflux_30  | 0.75047773  | 5.9194183   | 8.886453E-5  | 1237.3656 | 465507 |
| Mcorflux_100 | 0.6851532   | 3.415813    | 8.886452E-5  | 521.01306 | 465507 |
| Mcorflux_130 | 0.65762204  | 2.9565108   | 8.886452E-5  | 496.79926 | 465507 |
| Ncorflux_30  | 0.23722996  | 5.3514657   | 9.877582E-5  | 2300.2861 | 465501 |
| Ncorflux_100 | 0.22025737  | 3.0437691   | 9.877579E-5  | 994.1644  | 465501 |
| Ncorflux_130 | 0.21549404  | 2.8604133   | 9.877577E-5  | 994.1628  | 465501 |
+--------------+-------------+-------------+--------------+-----------+--------+

How to xmatch both catalogs ?

As MDFC is derived from JSDC, either coordinates or Name (SIMBAD id) should match against JSDC2 (GD) or latest JSDC.

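+-----------------+----------+
| Catalog         | All Rows |
+-----------------+----------+
| MDFC            | 465 857  |
| JSDC2_GD        | 465 877  |
| JSDC2+          | 482 723  |
| JSDC 2020 (raw) | 494 401  |
+-----------------+----------+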

* Xmatch based on Name (columns SIMBAD x NAME):

  • JSDC2_GD x MDFC: 455 726 matches
  • JSDC2+ x MDFC: 443 413 matches
  • JSDC_2020 x MDFC: 445 491 matches (BEST)

* Xmatch based on coordinates:

  • JSDC2_GD x MDFC (radius = 0.1as): 413 405 matches (same ra/dec ASCC ? no, apparently)
  • JSDC2+ x MDFC (radius = 0.1as): 414 506 matches (same ra/dec ASCC ? no, apparently)
  • JSDC_2020 x MDFC (radius = 0.1as): 425 986 matches (same ra/dec GAIA ? better)
  • JSDC_2020 x MDFC (radius = 0.5as): 448 654 matches (better)
  • JSDC_2020 x MDFC (radius = 1.0as): 450 682 matches (better)

Xmatch on positions has many bad matches: subset 'bad': 38753 rows (9%) abs(LDD_est-ldd)/ldd > 0.01
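
The 'bad' subset criterion above is a relative difference test on the diameters. A small Python sketch, assuming `ldd_est` comes from MDFC and `ldd` from JSDC for the same matched pair:

```python
def is_bad_match(ldd_est, ldd, rel_tol=0.01):
    """True when abs(LDD_est - ldd) / ldd exceeds rel_tol (1% by default)."""
    if ldd <= 0:
        raise ValueError("ldd must be positive")
    return abs(ldd_est - ldd) / ldd > rel_tol

def bad_fraction(pairs, rel_tol=0.01):
    """Fraction of (ldd_est, ldd) pairs flagged as bad matches."""
    bad = sum(1 for est, ref in pairs if is_bad_match(est, ref, rel_tol))
    return bad / len(pairs)

print(bad_fraction([(0.200, 0.200), (0.210, 0.200)]))
```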

Test other values against JSDC (distance gaia, LDD_est) to validate xmatch ? OK (same except 10 bad matches)

Conclusion: the xmatch on columns SIMBAD x NAME is the best and safest one. However, this identifier relies on the SIMBAD service, which is not stable in time: JSDC candidates 2016 vs 2017 vs 2020 have different SIMBAD names (name or coordinates changed). This may be related to the SIMBAD / GAIA DR2 crossmatch that changed 3 million objects (ra/dec name ?)

MDFC Columns to consider

L, M, N median Fluxes and dispersion (equivalent to stddev ?):

---      nb_Lflux     ? Nber of flux values reported in band L
Jy       med_Lflux    ? Median flux value in band L
Jy       disp_Lflux   ? Dispersion of flux values in band L
---      nb_Mflux     ? Nber of flux values reported in band M
Jy       med_Mflux    ? Median flux value in band M
Jy       disp_Mflux   ? Dispersion of flux values in band M
---      nb_Nflux     ? Nber of flux values reported in band N
Jy       med_Nflux    ? Median flux value in band N
Jy       disp_Nflux   ? Dispersion of flux values in band N
These columns could be stored in extra Lflux_med / e_Lflux_med (error ?), Mflux_med / e_Mflux_med, Nflux_med / e_Nflux_med. Note: numbers of flux values do not really fit in SearchCal / JSDC columns ?

IRflag is specific to MDFC:

IRflag is also a 3-bit flag taking values ranging from "0" to "7":
• bit 1 is set if the star shows an IR excess, identified thanks to the [K-W4] and [J-H] color indexes, and the overall MIR excess statistic X_MIR computed from Gaia DR1;
• bit 2 is set if the star is extended in the IR, indicated by the extent flags reported in the WISE/AllWISE and AKARI catalogues;
• bit 3 is set if the star is a likely variable in the MIR, identified by the variability flags reported in the WISE/AllWISE catalogues, the MSX6C Infrared Point Source Catalogue, the IRAS PSC, and the 10-micron Catalog.
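
A hedged Python sketch for decoding IRflag. It assumes that "bit 1" is the least significant bit (value 1), "bit 2" has value 2 and "bit 3" has value 4; this numbering should be checked against the MDFC paper before use.

```python
IR_EXCESS = 1      # "bit 1": IR excess
IR_EXTENDED = 2    # "bit 2": extended in the IR
MIR_VARIABLE = 4   # "bit 3": likely variable in the MIR

def decode_irflag(flag):
    """Decode the 3-bit IRflag (0..7) into named booleans."""
    if not 0 <= flag <= 7:
        raise ValueError("IRflag must be in 0..7")
    return {
        "ir_excess": bool(flag & IR_EXCESS),
        "ir_extended": bool(flag & IR_EXTENDED),
        "mir_variable": bool(flag & MIR_VARIABLE),
    }

print(decode_irflag(5))
```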

JSDC x MDFC match

Only IRFlag column added (xmatch on SIMBAD x NAME columns):

http://jmmc.fr/~bourgesl/sclsvr_JSDC/MDFC/match_jsdc.fits.gz

Total Rows: 462294
+-------------------------------+----------------+----------------+---------------+---------------+----------+
| column                        | mean           | stdDev         | min           | max           | good     |
+-------------------------------+----------------+----------------+---------------+---------------+----------+
| SIMBAD                        |                |                |               |               | 462294   |
| LDD                           | 0.20245042     | 0.41593498     | 0.002724054   | 21.007        | 462294   |
| e_LDD                         | 0.00951694     | 0.0402834      | 1.558098E-4   | 3.524713      | 462294   |
| IRflag                        | 1.0616275      | 1.5379242      | 0             | 7             | 441361   |
+-------------------------------+----------------+----------------+---------------+---------------+----------+

JSDC 2020 generation

* Former JSDC releases are stored at: http://apps.jmmc.fr/~sclcat/JSDC/

TODO:

  • fix new column names (b_ / B_)
  • fix catalog filtering (duplicated GAIA/2MASS or SIMBAD identifiers) + what persistent identifier to use (Name is not reliable) ?
  • merge with MDFC 10 (stilts xmatch on coords or SAFE identifiers like SIMBAD) ? however GAIA positions are not the same as ASCC positions (MDFC) so original ra/dec coordinates should be preserved in SearchCal (input coords) to make the xmatch reliable. NO: MDFC coords are not matching

What's next ?

  • update the candidate catalogs (BRIGHT / FAINT) with more filters (identifiers) (use Simbad B/V mags ?)
  • run the JSDC FAINT scenario (2M)

Stats / Early results:

  • JSDC 2 (jsdc_2017_03_03.fits):
465877 rows

Comparison JSDC2 (GD) vs jsdc_2017_05_03_bright.fits: as these catalogs were done with a different input catalog (ASCC subset) between v2017.3 vs v2017.5, only 445367 rows are in common (SIMBAD / NAME identifiers)

XMatch stats:

MIN_SEP_2ND = min(XM_HIP_sep_2nd,XM_ASCC_sep_2nd,XM_GAIA_sep_2nd,XM_2MASS_sep_2nd,XM_WISE_sep_2nd,5.0)

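+-----------------+--------+------+--------------------+
| subset          | count  | pct  | Expr               |
+-----------------+--------+------+--------------------+
| All             | 494401 | 100% |                    |
| diamFlag        | 482845 | 98%  |                    |
| sep2_min_gt_1as | 481635 | 97%  | MIN_SEP_2ND >= 1.0 |
| sep2_min_gt_2as | 473187 | 96%  | MIN_SEP_2ND > 2.0  |
| sep2_min_gt_3as | 469141 | 95%  | MIN_SEP_2ND >= 3.0 |
| sep2_min_gt_5as | 456038 | 92%  | MIN_SEP_2ND >= 5.0 |
| group_size_eq_0 | 455524 | 92%  | GroupSize == 0     |
+-----------------+--------+------+--------------------+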

Notes:

  • MIN_SEP_2ND could help marking rows that may have an ambiguous xmatch (ASCC, HIP, GAIA, 2MASS, WISE?) but not SIMBAD !
  • GroupSize == 0 (means no other object within 5as in SIMBAD, ASCC, HIP, GAIA, 2MASS, WISE) based on XM_..._n_mates
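
The MIN_SEP_2ND expression can be sketched in Python as follows. The sketch assumes that a missing XM_..._sep_2nd value (no second candidate) is simply ignored, so that a star without any second candidate gets the 5.0 as cap; whether the stilts expression treats blank values the same way should be verified.

```python
def min_sep_2nd(sep_2nd_values, cap_as=5.0):
    """Smallest XM_*_sep_2nd value (arcsec), capped at cap_as.

    sep_2nd_values: values for HIP, ASCC, GAIA, 2MASS, WISE; use None
    for a missing value (assumption: missing values are ignored).
    """
    present = [s for s in sep_2nd_values if s is not None]
    return min(present + [cap_as])

def is_unambiguous(sep_2nd_values, threshold_as=2.0):
    """True when the closest 2nd candidate is farther than threshold_as."""
    return min_sep_2nd(sep_2nd_values) > threshold_as

print(min_sep_2nd([None, 3.2, None, 1.4, None]))
```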

Lower radius for crossmatches

According to the cumulative histogram of the separations (as), the crossmatch radius can now be lowered (it was 1.5 to 2.5 as in 2017) while still matching 98% of stars and giving fewer bad identifications. However, the probability of bad identification is already low (see the plot of the distance between the 1st and 2nd candidates within 5 as):

See jsdc_200909_hist_seps.pdf and jsdc_bright_20200827-sep2nd.pdf

Similarly the JSDC candidate catalog will be updated to match ASCC x SIMBAD with radius = 1.0 as instead of 2 as (lower limit on SIMBAD identification)...

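Here is the table of new crossmatch radii at 98% and 99% of all matches with radius > 2.0 as:

+---------+----------------------+----------------------+-------------------------------+
| Catalog | Previous radius (as) | 98% Radius 2020 (as) | Final new radius 2020.28 (as) |
+---------+----------------------+----------------------+-------------------------------+
| Simbad  | 2.0                  | 0.35 - 0.5           | 1.0                           |
| ASCC    | 1.5                  | 0.5                  | 1.5                           |
| GAIA    | 1.0                  | 0.3 - 0.4            | 0.8                           |
| HIP     | 1.5                  | 0.5 - 0.8            | 1.5                           |
| 2MASS   | 2.5                  | 0.45 - 0.7           | 1.5                           |
| WISE    | 5.0                  | 1.0 - 1.5            | 3.5                           |
+---------+----------------------+----------------------+-------------------------------+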

Note: to keep Sirius (2MASS, WISE) in JSDC, radius must be higher than (1.5, 2.5) ! Do not put too low values, check XM_..._sep values in the output catalog (if any filter is needed)
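
A radius "at 98%" can be read off the sorted separations, which is what the cumulative histogram shows graphically. The Python sketch below is one simple way to compute it from a list of XM_..._sep values; it is an illustration, not the procedure actually used to fill the table.

```python
import math

def radius_at_fraction(separations_as, fraction=0.98):
    """Smallest separation (arcsec) that contains at least `fraction`
    of the given separations."""
    if not separations_as:
        raise ValueError("no separations given")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in ]0, 1]")
    ordered = sorted(separations_as)
    index = max(0, math.ceil(fraction * len(ordered)) - 1)
    return ordered[index]

seps = [0.4, 0.1, 0.3, 0.2]
print(radius_at_fraction(seps, 0.5), radius_at_fraction(seps, 1.0))
```

As the Sirius note above shows, the value found this way is only a lower guide: bright, high proper motion stars can still require a larger radius.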

Status on 2020/10/13:

Simbad crossmatch (for candidate catalog)

  • ASCC (full) has 2.5 million rows:
    • TODO: select only rows with TYC3 defined (TYCHO 1/2 + HIP subset, not PPM & other catalogs, less precise on positions, proper motions & Vmag): download from vizier the full catalog to select TYC1 = 0 (2454556 rows) and get all columns (more filtering, below)
    • this subset ensures objects are unique within 0.8 arcsec
    • TODO: fix searchCal code to fix vizier queries (internally) TYC filter too
  • SIMBAD catalog (xmatch) filtering:
    • select main_type is 'Star' or other_type contains '*': 5492984 rows
    • V < 16: 3088937 rows
    • with sp_type: 562150 rows (NULL_sp_type)

Warning: the crossmatch on positions (1as) is not very good according to the V mag comparison: bad identification of sources ?

In stilts / topcat, it is possible to use sky+X matcher to compare both positions and V magnitudes. What thresholds to use ? position: 2as (or 1as) and X error: 1 mag (0.5 or 0.3) ?
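
The idea of a combined position + magnitude match can be sketched in Python as a simple acceptance test where both tolerances must be satisfied. The default thresholds below are placeholders taken from the candidate values listed above (the choice is still open), and the function is not the topcat matcher itself.

```python
def accept_match(sep_as, v_ascc, v_simbad, max_sep_as=1.0, max_dv_mag=0.5):
    """Accept a candidate pair only if both the angular separation and
    the V magnitude difference are within their tolerances."""
    return sep_as <= max_sep_as and abs(v_ascc - v_simbad) <= max_dv_mag

# Close on the sky and consistent in V: accepted
print(accept_match(0.3, 8.10, 8.25))
# Close on the sky but V = 8 vs 15: rejected as a likely bad identification
print(accept_match(0.3, 8.0, 15.0))
```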

Note: few incorrect V mags found in SIMBAD (mag = 8 vs 15)

Alternatively, to get SIMBAD sp_type & object types, it is possible to use SIMBAD resolver (by TYCHO id) to get more correct identifications ? but it means 2.5M objects to resolve or only ambiguous matches ...

HIP catalogs crossmatch

SearchCal crossmatch on HIP1/2 seems incorrect (many duplicates) as the radius is too high (1.5) => double stars will have the same B / V magnitudes !

TODO: ignore HIP queries (useless with GAIA astrometry), to avoid any issue on B/V mags (bad identifications)

GAIA crossmatch

Define the consistency check between V & G mags using the polynomial relation (G - V) vs (G_Bp-G_Rp)

BP_RP:  Bp - Rp (gaia Blue / Red filters)

V_est =  G - Pol(BP_RP)    where Pol(BP_RP) = G - V and x = BP_RP

Bp-Rp < 2.5
Pol(BP_RP) = -(0.01760 + 0.006860 * x + 0.1732 * x * x)

Bp-Rp > 2.5: (visual fit) valid up to Bp-Rp < 4.5
Pol(BP_RP) = -(0.28 + 0.132 * x * x)

V_est = G + ((BP_RP < 2.5) ? (0.01760 + 0.006860 * BP_RP + 0.1732 * BP_RP * BP_RP) : (0.28 + 0.134 * BP_RP * BP_RP))

TODO: check the x * x coefficient for Bp-Rp > 2.5 (0.132 in the fit vs 0.134 in the expression)

Criteria to define good matches:
abs(V_est+0.015-V) <= min(0.4, max(e_v, 0.1))*5

offset on V_est: ~ 0.015
use 5 sigma ie (5x)
sigma = e_v (min 0.1, max 0.4) => range is 0.5 to 2.0 mags (quite large) !
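
The check can be sketched in Python as below. The sketch follows the one-line expression above, i.e. V_est is G plus a positive polynomial in BP_RP, and uses its 0.134 coefficient for the red branch (with that value both branches give about 1.117 at BP_RP = 2.5). Function and argument names are illustrative only.

```python
def v_est(g_mag, bp_rp):
    """Estimate V from G and the Bp-Rp colour (fit valid up to Bp-Rp < 4.5)."""
    if bp_rp < 2.5:
        return g_mag + (0.01760 + 0.006860 * bp_rp + 0.1732 * bp_rp * bp_rp)
    # Red branch: visual fit, coefficient taken from the one-line expression
    return g_mag + (0.28 + 0.134 * bp_rp * bp_rp)

def is_good_match(v_mag, e_v, g_mag, bp_rp, offset=0.015, n_sigma=5.0):
    """abs(V_est + offset - V) <= n_sigma * sigma, sigma = e_v in [0.1, 0.4]."""
    sigma = min(0.4, max(e_v, 0.1))
    return abs(v_est(g_mag, bp_rp) + offset - v_mag) <= n_sigma * sigma

print(is_good_match(v_mag=10.03, e_v=0.01, g_mag=10.0, bp_rp=0.0))
```

With e_v at its 0.1 floor the tolerance is 0.5 mag, so a V magnitude about 2 mag away from the estimate is rejected unless e_v reaches the 0.4 cap.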


Test on JSDC 2020_10_02:
good_G   459948   95%    abs(V_est-V) < 0.25 && gV_est

Improving crossmatch algorithm

Finally, the SearchCal crossmatch algorithm should be improved to perform a 'Best' match (symmetric approach) and properly resolve close fields in both catalogs (list_1 X list_2).

For now, it behaves more like topcat's matcher 'Best match for each Table 1 row' i.e. multi-cone search.
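
The difference between the two behaviours can be illustrated with a short Python sketch. Candidate pairs are given as hypothetical (row1, row2, separation) tuples already restricted to the match radius; the symmetric version shown here is one possible greedy approach (closest pairs first, each row used at most once) and is not the SearchCal code.

```python
def best_for_each_row1(pairs):
    """For every row of list 1 keep its closest row of list 2.
    The same list 2 row may be assigned to several list 1 rows."""
    best = {}
    for i, j, sep in pairs:
        if i not in best or sep < best[i][1]:
            best[i] = (j, sep)
    return {i: j for i, (j, _) in best.items()}

def best_symmetric(pairs):
    """Greedy symmetric match: take pairs by increasing separation and
    use each row of either list at most once."""
    used_1, used_2, result = set(), set(), {}
    for i, j, sep in sorted(pairs, key=lambda p: p[2]):
        if i not in used_1 and j not in used_2:
            result[i] = j
            used_1.add(i)
            used_2.add(j)
    return result

# A close double: rows 0 and 1 of list 1 both lie near row 0 of list 2
pairs = [(0, 0, 0.2), (1, 0, 0.5), (1, 1, 0.9)]
print(best_for_each_row1(pairs))
print(best_symmetric(pairs))
```

In the first case both list 1 rows receive the same list 2 object (the duplicate problem described earlier); in the second, row 1 falls back to its next candidate.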

-- Bourges Laurent - 2020-09-13

Topic attachments

| Attachment                      | Size   | Date             | Who            | Comment                                               |
| ASCC_vs_SIMBAD_Vmags.pdf        | 76.2 K | 2020-10-13 15:56 | LaurentBourges | ASCC_vs_SIMBAD                                        |
| ASCC_vs_SIMBAD_Vmags_histo.pdf  | 5.5 K  | 2020-10-13 15:56 | LaurentBourges | ASCC_SIMBAD_histo                                     |
| GAIA_col_V-G.pdf                | 39.8 K | 2020-10-13 15:57 | LaurentBourges | GAIA_pol                                              |
| gaia_pol_G-V.pdf                | 77.4 K | 2020-11-04 08:57 | LaurentBourges | Gaia G-V fit                                          |
| gaia_pol_V_diff.pdf             | 64.7 K | 2020-11-04 08:57 | LaurentBourges | Gaia Tycho2 V diff                                    |
| jsdc_200909_hist_seps.pdf       | 15.5 K | 2020-09-25 15:22 | LaurentBourges | jsdc separation histogram                             |
| jsdc_bright_20200827-sep2nd.pdf | 10.9 K | 2020-09-28 10:09 | LaurentBourges | Separation between 1st and 2nd object: XM_..._sep_2nd |
Topic revision: r12 - 2020-11-04 - LaurentBourges