Elephant provides the Indicator API as a way to generate statistics from entities. Its main goal is to make these statistics easy to generate across modules, even when the modules use separate databases.
Indicators are usually stored through a separate convenience API. The Ranking and Matching APIs are examples: both take indicator values and store them where needed, regardless of the source of the stored data.
Because indicators are stored through separate APIs, an indicator's composition may vary. For example, the Ranking API relates to sets of single entities, while the Matching API relates two entities. Given that storage depends on these other APIs and the Indicator API needs read access, the API imposes some requirements.
Since indicators only generate and provide data, reading that data from a separate process would seem to be beyond their scope.
The reading mechanism fills this gap and provides indicator values for statistical purposes. To achieve this, indicators find their storage classes and provide a specific syntax for reading values on a single, multiple or formula basis.
The readIndicator parameter takes the form:
JPAEntityClass:IndicatorClass:entityPath[:relatedPath]:indicator

Element | Description |
---|---|
JPAEntityClass | These classes are usually named after their functionality. You can see an example in Ranking, where the class |
IndicatorClass | Class that owns the JPA Dao and has the ability to read values. |
entityPath | Refers to the entity to be read. Notice that |
relatedPath | Same as for entityPath but for the related entity. This field has meaning only when reading a matching indicator. |
indicator | Refers to which indicator will be returned. For indicator the |
\* | Represents one or more characters. |
? | Represents a single character. |
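
As a rough illustration of the syntax above, the following sketch splits a readIndicator parameter into its parts and turns the wildcards into a regular expression. The class `ReadIndicatorSpec`, its methods and the sample values are hypothetical and not part of Elephant. The sketch also assumes that no part contains a colon and that the wildcards apply to the indicator part.

```java
import java.util.regex.Pattern;

/** Hypothetical holder for the parts of a readIndicator parameter (not an Elephant class). */
final class ReadIndicatorSpec {
    final String entityClass;     // JPAEntityClass
    final String indicatorClass;  // IndicatorClass
    final String entityPath;      // entityPath
    final String relatedPath;     // relatedPath, null when the optional part is absent
    final String indicator;       // indicator, may contain * and ?

    private ReadIndicatorSpec(String entityClass, String indicatorClass,
                              String entityPath, String relatedPath, String indicator) {
        this.entityClass = entityClass;
        this.indicatorClass = indicatorClass;
        this.entityPath = entityPath;
        this.relatedPath = relatedPath;
        this.indicator = indicator;
    }

    /** Splits JPAEntityClass:IndicatorClass:entityPath[:relatedPath]:indicator (assumes no colons inside parts). */
    static ReadIndicatorSpec parse(String parameter) {
        String[] parts = parameter.split(":", -1);
        if (parts.length == 4) {
            return new ReadIndicatorSpec(parts[0], parts[1], parts[2], null, parts[3]);
        }
        if (parts.length == 5) {
            return new ReadIndicatorSpec(parts[0], parts[1], parts[2], parts[3], parts[4]);
        }
        throw new IllegalArgumentException("Expected 4 or 5 colon-separated parts: " + parameter);
    }

    /** Builds a regex where * means one or more characters and ? means exactly one character. */
    Pattern indicatorPattern() {
        StringBuilder regex = new StringBuilder();
        for (char c : indicator.toCharArray()) {
            if (c == '*') {
                regex.append(".+");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Tells whether a concrete indicator name matches the (possibly wildcarded) indicator part. */
    boolean matchesIndicator(String name) {
        return indicatorPattern().matcher(name).matches();
    }
}
```

With a hypothetical parameter such as ProjectRanking:RankingIndicator:project:days*, the related path is absent and the indicator part would match a name like daysCreation but not the bare name days, since * needs at least one character.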
To understand how the reading syntax helps locate values, this image shows the process of getting values.
Indicators are created within modules, using JPA contexts. Each module stores the results wherever is convenient so that they are accessible to database queries. The Ranking API and the Matching API help save results in specialized tables. This is the standard, and preferred, method.
Before diving into possible process optimizations, let's see some numbers.
A ranking set of indicators will create (indicators + final_ranking) * entities tuples. For instance, if you have 500 entities and need 10 indicators to calculate the ranking value, this will create (10 + 1) * 500 = 5,500 tuples. Not so bad.
A matching set of indicators will create (indicators + final_matching) * entities * related_entities tuples. Things have slightly changed, since there is another factor: the related entities to match with. The matched entities are usually contacts. Let's take the same example as above and say that we have 1,000 contacts to match with. Applying that factor to the previous result gives 5,500,000 tuples. Quite impressive, isn't it?
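
The two estimates above can be checked with a few lines of Java. This is only a sketch of the arithmetic; the class and method names are illustrative, and the constant 1 stands for the single final ranking or matching value stored alongside the indicators.

```java
/** Illustrative helper that reproduces the tuple estimates from the text. */
final class TupleEstimate {

    /** (indicators + final_ranking) * entities, with one final ranking value per entity. */
    static long rankingTuples(long indicators, long entities) {
        return (indicators + 1) * entities;
    }

    /** (indicators + final_matching) * entities * related_entities. */
    static long matchingTuples(long indicators, long entities, long relatedEntities) {
        return (indicators + 1) * entities * relatedEntities;
    }

    public static void main(String[] args) {
        System.out.println(rankingTuples(10, 500));         // prints 5500
        System.out.println(matchingTuples(10, 500, 1000));  // prints 5500000
    }
}
```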
Spare zeros is a first approach to optimizing the results: the API does not save zero or near-zero values. The impact of this optimization depends heavily on the related-entity selection made in the second approach, but it is at least significant for non-relevant indicators.
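
A minimal sketch of the spare zeros idea follows. It filters a map of indicator values before they would be stored. The class name, the map layout and the `EPSILON` threshold are assumptions made for illustration; the text does not say what cut-off the API actually uses for "near zero".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative filter that drops zero and near-zero indicator values before storage. */
final class SpareZeros {

    /** Assumed threshold below which a value counts as "near zero". */
    static final double EPSILON = 1e-6;

    /** Returns only the entries worth saving, preserving their original order. */
    static Map<String, Double> withoutZeros(Map<String, Double> values) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : values.entrySet()) {
            if (Math.abs(entry.getValue()) > EPSILON) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

In a matching run, where most entity and contact pairs are likely to have no relationship at all, a filter of this kind would skip the bulk of the candidate tuples.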
The fine selection approach affects both entities and related entities. It is difficult to implement, since it implies some kind of guessing about which pairs are likely to match. The impact of this optimization is high because it should eliminate zeros and, more importantly, reduces the number of database reads.
Bulk data might be the most effective approach. It makes an initial selection and does the insertions as a pre-process. When doing so, the indicator does not have to load entities separately; it knows the result from the data already read.
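
The bulk data idea can be sketched as follows, under heavy assumptions: the rows stand for the result of one hypothetical bulk query (one row per participation of a contact in an entity), and the class, field and method names are invented for illustration. The point is only that values are derived from data read once, without loading each entity.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative bulk pre-process: derive indicator values from rows read in a single pass. */
final class BulkParticipation {

    /** Hypothetical row of a bulk query: one participation of a contact in an entity. */
    static final class Row {
        final long entityId;
        final long contactId;

        Row(long entityId, long contactId) {
            this.entityId = entityId;
            this.contactId = contactId;
        }
    }

    /** Counts participations per (entity, contact) pair; pairs without rows never appear. */
    static Map<String, Integer> countByPair(List<Row> rows) {
        Map<String, Integer> counts = new HashMap<>();
        for (Row row : rows) {
            counts.merge(row.entityId + ":" + row.contactId, 1, Integer::sum);
        }
        return counts;
    }

    /** Looks up the pre-computed value; a missing pair is a zero that needs no tuple. */
    static int participation(Map<String, Integer> counts, long entityId, long contactId) {
        return counts.getOrDefault(entityId + ":" + contactId, 0);
    }
}
```

Because only pairs that actually occur in the data end up in the map, this approach also spares the zeros as a side effect.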
Formulas are mainly composed of variables, constants, operators and functions. See Variables for a full list of available ranking and matching variables.
The Elephant Indicator API also adds some functions to make formulas easier to edit.
Since variables return absolute values, whether or not they relate to a second entity, handling the results gets complicated. To alleviate this, the weighted functions provide an easier input.
weighted(value, meaning, weight)
Returns a weighted value based on:
reverseWeighted(value, meaning, weight)
Returns a weighted complementary value based on the same parameters as weighted.
You can also use math functions to create your own calculations. Math functions are prefixed with Math; for example, Math.floor(2.3) returns the floor of its parameter, 2.
Variables are used inside formulas to create the final ranking or matching. Descriptions use "current entity" for the entity that is being processed and "self" for the related entity in a matching, usually a contact, hence the self point of view.
Variables are used in Formulas.

Variable | Type | Description |
---|---|---|
attachments | BOTH_VARIABLE | Ranking: Counts the number of attachments in current entity. Matching: Counts the number of self uploaded attachments in current entity. |

Variable | Type | Description |
---|---|---|
following | BOTH_VARIABLE | |
followed | BOTH_VARIABLE | |
seen | BOTH_VARIABLE | Ranking: Counts the number of times current entity has been seen. Matching: Tells whether a user has seen current entity (1 = seen). |
like | BOTH_VARIABLE | |
apply | BOTH_VARIABLE | |

Variable | Type | Description |
---|---|---|
available | RANKING_VARIABLE | |
profile | RANKING_VARIABLE | |
distance | MATCHING_VARIABLE | |

Variable | Type | Description |
---|---|---|
participation | BOTH_VARIABLE | Ranking: Counts the number of participations in current entity. Matching: Counts the number of self participations in current entity. |
categoryParticipation | BOTH_VARIABLE | Ranking: Counts the number of participations in current category's entity. Matching: Counts the number of self participations in current category's entity. |
daysCreation | RANKING_VARIABLE | Returns the days passed since current entity's creation. |
daysActivity | RANKING_VARIABLE | Returns the days passed since current entity's last activity. |
words | RANKING_VARIABLE | Counts the number of words used to describe current entity. |
issues | BOTH_VARIABLE | Ranking: Counts the number of issues in current entity. Matching: Counts the number of self participating issues in current entity. |
issueActivity | BOTH_VARIABLE | Ranking: Counts the number of reported issue activity in current entity. Matching: Counts the number of self reported issue activity in current entity. |
issueResponsible | RANKING_VARIABLE | |
issueReporter | RANKING_VARIABLE | |
issueQA | RANKING_VARIABLE | |
issueAssistant | RANKING_VARIABLE | |
siblings | MATCHING_VARIABLE | Counts the number of self siblings' participations in current entity. |
siblingsCategory | MATCHING_VARIABLE | Counts the number of self siblings' participations in current category's entity. |

Variable | Type | Description |
---|---|---|
topics | BOTH_VARIABLE | Ranking: Counts the number of topics in current entity. Matching: Counts the number of self created topics in current entity. |
posts | BOTH_VARIABLE | Ranking: Counts the number of posts in current entity. Matching: Counts the number of self created posts in current entity. |

Variable | Type | Description |
---|---|---|
stars | BOTH_VARIABLE | Ranking: Averages the number of stars for current entity. Matching: Averages the number of self given stars for current entity. |
comments | BOTH_VARIABLE | Ranking: Counts the number of comments in current entity. Matching: Counts the number of self comments in current entity. |

Variable | Type | Description |
---|---|---|
participation | MATCHING_VARIABLE | Counts the number of self participations in current entity. |
daysCreation | RANKING_VARIABLE | Returns the days passed since current entity's creation. |
daysStart | RANKING_VARIABLE | |
daysEnd | RANKING_VARIABLE | |
words | RANKING_VARIABLE | Counts the number of words used to describe current entity. |

Variable | Type | Description |
---|---|---|
challenges | RANKING_VARIABLE | Counts the number of challenges in current entity. |
responses | RANKING_VARIABLE | Counts the number of responses to challenges in current entity. |
responsesLike | RANKING_VARIABLE | |
daysCreation | RANKING_VARIABLE | Returns the days passed since current entity's creation. |
distance | MATCHING_VARIABLE | |
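
Across all the variable tables, each variable carries one of three type labels. The sketch below shows one way to read them, assuming that BOTH_VARIABLE means usable in ranking and matching formulas alike, while the other two are restricted to one kind of formula. The enum is hypothetical: it only mirrors the labels in the tables and is not taken from the Elephant code.

```java
/** Hypothetical enum mirroring the type labels used in the variable tables. */
enum VariableType {
    BOTH_VARIABLE(true, true),
    RANKING_VARIABLE(true, false),
    MATCHING_VARIABLE(false, true);

    private final boolean ranking;
    private final boolean matching;

    VariableType(boolean ranking, boolean matching) {
        this.ranking = ranking;
        this.matching = matching;
    }

    /** Assumed meaning: the variable can appear in a ranking formula. */
    boolean usableInRanking() {
        return ranking;
    }

    /** Assumed meaning: the variable can appear in a matching formula. */
    boolean usableInMatching() {
        return matching;
    }
}
```

Under this reading, a variable such as distance (MATCHING_VARIABLE) would only make sense in a matching formula, where a second entity exists to measure against.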