International Federation of Digital Seismograph Networks

Thread: Questions about FDSN web services

Started: 2013-01-28 21:11:28
Last activity: 2013-01-29 05:47:54
Doug Neuhauser
2013-01-28 21:11:28
As I look over the definition of the FDSN web services
http://www.fdsn.org/webservices/FDSN-WS-Specifications-1.0.pdf
it is not clear to me whether the parameter specification Table 1 is
defining the valid default parameters for ALL of the web services.

I have 2 basic concerns:
1. A minimal query with no parameters will generate a maximal
length response.
2. The defaults as specified in Table 1 may be very difficult
(or impossible) for some sites to implement.

Specific questions:

1. It appears that the "limit" parameter default is [Any].
Does that mean that any query to any fdsn web service that
does not specify a "limit" parameter should return up to the
site-dependent max limit, or may a site impose its own "default" limit
that it documents in the application-wadl?

2. It appears that none of the parameters are required for any
web service query. The simple-time and time-window defaults are listed as [Any],
but the manual states on page 3:
for example, for a call to the fdsnws-event service the client should
specify a simple-time or time window definition but not both.
Is it valid to specify neither, and in that case, MUST the web service
use the default of [Any] for the time boundaries?

3. fdsn-eventws service:

a. For sites that host multiple event catalogs each in a completely
separate database, the concept of the default query requesting data
from each catalog may not be easily implemented. Can a site impose
its own "default" catalog, or must we either implement the default
of ALL catalogs OR be forced to not implement the fdsn-event web service
at all (or restrict it to only 1 catalog)? Frankly, the idea of a
default event search returning an unassociated list of events for possibly
the "same earthquake" with different locations and magnitudes from different
catalogs seems like a strange default.

c. Implementing the orderby for multiple catalogs may again be very
difficult. Can we consider the orderby to be WITHIN a catalog, or must
it be ACROSS catalogs?

d. The updatedafter parameter does not specify whether the hypocenter
and time have been updated or whether ANY parameter or observation
associated with the event has been updated. This may again be very
difficult to implement.

I would like to provide an fdsn-event web service, but the defaults
may make this difficult if not impossible to implement.

Thanks in advance for your feedback.

- Doug N

--
------------------------------------------------------------------------
Doug Neuhauser University of California, Berkeley
doug<at>seismo.berkeley.edu Berkeley Seismological Laboratory
Office: 510-642-0931 215 McCone Hall # 4760
Fax: 510-643-5811 Berkeley, CA 94720-4760
Remote: 530-752-5615 (Wed,Fri)

  • Chad Trabant
    2013-01-29 04:15:51

    Hi Doug,

    Regarding your basic concerns: a data center always has the option of responding with an HTTP 413 status to indicate that too much data has been requested. This limit will most likely be different for each data center implementing the web services. For NCEDC you can, for example, choose to require the network parameter and if it's not supplied you could return a 413 with the message stating that the network code is required for NCEDC. In other words, you are free to choose your own maximum length response in a variety of ways.
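A minimal sketch of the 413 approach described above, assuming a site that insists on the network parameter. The function name `check_request`, the `REQUIRED_PARAMS` tuple, and the message wording are hypothetical and not taken from the specification.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical site policy: parameters this data center insists on.
REQUIRED_PARAMS = ("network",)


def check_request(params: Mapping[str, str]) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, error message) for a query's parameters."""
    missing = [name for name in REQUIRED_PARAMS if not params.get(name)]
    if missing:
        # 413 signals that the request as posed would return too much data.
        message = "Request too large: %s is required at this data center" % ", ".join(missing)
        return 413, message
    return 200, None


# A query with no parameters is refused; one naming a network is accepted.
print(check_request({}))
print(check_request({"network": "BK"}))
```

A site could apply the same check to any other parameter, or to an estimate of the result size, since the threshold is left to each data center.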

    Regarding your specific questions:

    1) It was not the intention for a site to assign a default limit different than a site-dependent maximum. It was the intention that if no limit was set the client would get up to the site-dependent maximum. I'm not sure if such a thing can be documented in a WADL.

    2) On page 3 we should probably change "should" to "could" in the sentence you quote to fix the minor discord. It was the intention that if no time boundaries were specified they would not be considered as criteria; in other words, no default time boundaries are expected.

    3a) It was the intention that all catalogs be searched when no catalog has been specified, which makes perfect sense when, as at the DMC, the catalogs have been associated with each other. You're correct that if the catalogs were stored and managed separately it may be confusing. Setting and documenting a default catalog sounds perfectly reasonable to me in this case. I would support a change to the specification to clarify that this is allowed and should be expected.

    3b) 42.

    3c) I think if you clearly document how you implement the orderby option it would be OK.

    3d) The expectation is that the data center defines when an event/origin/magnitude/etc. has been updated. Each data center will have a different set of parameters so a rigid definition would not be flexible enough. The intention of this option is to facilitate synchronization of event parameter holdings, so a client can process new events as added at the data center.
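The synchronization use of updatedafter in 3d) could look roughly like this from the client side. This is only a sketch: the host name is a placeholder, the `/fdsnws/event/1/query` path and the timestamp format are assumptions about how a service would be deployed, and `build_sync_url` is a hypothetical helper.

```python
from datetime import datetime
from urllib.parse import urlencode

# Placeholder host; the service path is an assumption about the deployment.
BASE_URL = "http://service.example.org/fdsnws/event/1/query"


def build_sync_url(last_sync: datetime, base_url: str = BASE_URL) -> str:
    """Build a query for events the data center considers updated since last_sync."""
    # What counts as "updated" is decided by the data center, not the client.
    params = {"updatedafter": last_sync.strftime("%Y-%m-%dT%H:%M:%S")}
    return base_url + "?" + urlencode(params)


# The client stores the time of its previous run and asks only for newer changes.
url = build_sync_url(datetime(2013, 1, 28, 21, 11, 28))
print(url)
```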

    These are just my interpretations, I'd be interested to hear others.
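Under the interpretations in 1), 2) and 3a), a server might fill in missing parameters along these lines. This is a sketch with assumptions: `SITE_MAX_EVENTS`, `SITE_DEFAULT_CATALOG` and `resolve_criteria` are hypothetical names, `starttime` and `endtime` stand in for the time-window parameters, and clamping an oversized limit (rather than answering 413) is one possible site choice.

```python
from typing import Dict, Mapping, Optional

SITE_MAX_EVENTS = 10000        # hypothetical site-dependent maximum
SITE_DEFAULT_CATALOG = "MAIN"  # hypothetical documented default catalog


def resolve_criteria(params: Mapping[str, str]) -> Dict[str, Optional[object]]:
    """Fill in search criteria for parameters the client left out."""
    requested = int(params["limit"]) if "limit" in params else SITE_MAX_EVENTS
    return {
        # No limit given: return up to the site-dependent maximum.
        "limit": min(requested, SITE_MAX_EVENTS),
        # Missing time bounds stay None, meaning they are not used as criteria.
        "starttime": params.get("starttime"),
        "endtime": params.get("endtime"),
        # Sites with separately managed catalogs fall back to a documented default.
        "catalog": params.get("catalog", SITE_DEFAULT_CATALOG),
    }


print(resolve_criteria({}))
print(resolve_criteria({"limit": "50", "catalog": "OTHER"}))
```

A site whose catalogs are already associated with each other would drop the default catalog and search all of them instead.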

    Chad

    _______________________________________________
    fdsn-wg3-products mailing list
    fdsn-wg3-products<at>iris.washington.edu
    http://www.iris.washington.edu/mailman/listinfo/fdsn-wg3-products



    • Doug Neuhauser
      2013-01-29 05:16:56
      Chad,

      If the DMC collects event information from different sources
      (or "catalogs" as you call them) and then associates the events
      from the different "catalogs", it seems to me that you have a
      new IRIS catalog with information from multiple contributors
      rather than N catalogs. I guess I don't understand why you
      consider this to be N catalogs rather than 1 catalog.

      a. IRIS determines which origin to return (unless all origins
      are requested).
      b. IRIS determines which magnitude to return (unless all
      magnitudes are requested).
      c. IRIS determines how the events from the different input
      catalogs are associated with each other.

      This seems to me to be the hallmark of a catalog.

      - Doug N

      • Chad Trabant
        2013-01-29 05:47:54

        Hi Doug,

        We prefer to think of what we serve as a meta catalog. The association logic is very simple, with no human review, and the preferred origin and magnitude are not determined directly but via the choice of a preferred catalog, whose notion of preferred is then adopted. No information is "cleaned" in this grouping, and a single catalog is easily subset from the data set. The DMC event information is primarily to assist in waveform data selection and to allow access to multiple catalogs at once.
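As a rough illustration of that idea, and not a description of the DMC's actual implementation, the selection could be sketched as below. The catalog names, the `preferred` flag, the record layout, and the fallback to the next catalog when the top-ranked one contributes nothing are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical ranking of contributing catalogs, most preferred first.
CATALOG_PREFERENCE = ["CAT_A", "CAT_B", "CAT_C"]


def preferred_origin(origins: List[Dict[str, object]]) -> Optional[Dict[str, object]]:
    """Adopt the preferred origin of the highest-ranked catalog that has one."""
    for catalog in CATALOG_PREFERENCE:
        for origin in origins:
            if origin["catalog"] == catalog and origin.get("preferred"):
                return origin
    return None


def subset(origins: List[Dict[str, object]], catalog: str) -> List[Dict[str, object]]:
    """Return one catalog's origins unchanged; nothing is cleaned or merged."""
    return [origin for origin in origins if origin["catalog"] == catalog]


# Origins grouped under one event by the (assumed) association step.
origins = [
    {"catalog": "CAT_B", "preferred": True, "magnitude": 4.1},
    {"catalog": "CAT_A", "preferred": False, "magnitude": 4.3},
    {"catalog": "CAT_C", "preferred": True, "magnitude": 4.0},
]
print(preferred_origin(origins))
print(subset(origins, "CAT_C"))
```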

        The fdsnws-event service is designed to be generic for multiple uses at multiple data centers. I'm happy to discuss how we can improve it to that end.

        Perhaps we should move the discussion of how the DMC organizes event parameters off-list.

        Chad
