Indicators

Elephant provides the Indicator API as a way to generate statistics from entities. Its main goal is to make those statistics easy to generate across modules, even when the modules use separate databases.

Indicators are usually stored through a separate convenience API. The Ranking and Matching APIs are examples: both take indicator values and store them where needed, independently of the source of the stored data.

Main goals

Indicator composition

Since indicators are stored through separate APIs, their composition may vary. For example, the Ranking API relates to sets of single entities, while the Matching API relates pairs of entities. Because storage depends on those other APIs and the Indicator API needs read access to it, the API imposes some requirements.

The reading mechanism

Since indicators only generate and provide data, reading that data back from a separate process might seem out of their reach.

The reading mechanism fills this gap by providing indicator values for statistical purposes. To do so, indicators locate the storage classes and offer a specific syntax for reading a single value, multiple values, or the result of a formula.

Reading syntax

The readIndicator parameter takes the form:

JPAEntityClass:IndicatorClass:entityPath[:relatedPath]:indicator

JPAEntityClass

These classes are usually named after their functionality. For an example, see Ranking, where the class DossierRanking holds the indicators for a dossier's ranking.

IndicatorClass

The class that owns the JPA DAO and is able to read values.

entityPath

Refers to the entity to be read. Note that the readIndicator method always returns a SUM of the results. An identifier returns the results for a single entity, while * returns the results for all entities.

relatedPath

Same as for entityPath but for the related entity. This field has meaning only when reading a matching indicator.

indicator

Selects the indicator to return. The word final has a special meaning here: it refers to the result of the formula for the specified JPAEntityClass. Indicators can use wildcards to get fine-grained statistics. For instance, status* returns the sum of all indicators whose names start with status.
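As an illustration of the syntax above, the following Java sketch splits a readIndicator parameter into its parts. The class IndicatorPath, its field names and the sample value DossierIndicator are hypothetical and not part of the Elephant API. The sketch also assumes that no part contains a colon.

```java
import java.util.Optional;

/** Hypothetical value class for illustration; not part of the Elephant API. */
final class IndicatorPath {
    final String jpaEntityClass;
    final String indicatorClass;
    final String entityPath;
    final Optional<String> relatedPath; // only present when reading a matching indicator
    final String indicator;

    private IndicatorPath(String jpaEntityClass, String indicatorClass, String entityPath,
                          Optional<String> relatedPath, String indicator) {
        this.jpaEntityClass = jpaEntityClass;
        this.indicatorClass = indicatorClass;
        this.entityPath = entityPath;
        this.relatedPath = relatedPath;
        this.indicator = indicator;
    }

    /** Splits JPAEntityClass:IndicatorClass:entityPath[:relatedPath]:indicator. */
    static IndicatorPath parse(String expression) {
        // Assumption: the parts themselves never contain ':'.
        String[] parts = expression.split(":", -1);
        if (parts.length < 4 || parts.length > 5) {
            throw new IllegalArgumentException("Expected 4 or 5 parts: " + expression);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty part in: " + expression);
            }
        }
        Optional<String> related =
                parts.length == 5 ? Optional.of(parts[3]) : Optional.empty();
        return new IndicatorPath(parts[0], parts[1], parts[2], related,
                parts[parts.length - 1]);
    }
}
```

With this sketch, IndicatorPath.parse("DossierRanking:DossierIndicator:*:status*") yields an entityPath of *, an indicator of status* and no relatedPath.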

Wildcards

*

Represents one or more characters.

?

Represents a single character.
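A minimal sketch of how such wildcards could be resolved, assuming the indicator values are available as a map from indicator name to value. The class IndicatorWildcards and the sample names are hypothetical, and Elephant's own implementation may differ. Following the definitions above, * is translated as one or more characters, so under this reading status* does not match an indicator named exactly status.

```java
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the Elephant API. */
final class IndicatorWildcards {

    /** Translates the wildcard syntax into a regular expression. */
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".+"); // one or more characters
            } else if (c == '?') {
                regex.append('.');  // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Sums the values of every indicator whose name matches the wildcard. */
    static double sum(Map<String, Double> values, String wildcard) {
        Pattern pattern = toPattern(wildcard);
        double total = 0.0;
        for (Map.Entry<String, Double> entry : values.entrySet()) {
            if (pattern.matcher(entry.getKey()).matches()) {
                total += entry.getValue();
            }
        }
        return total;
    }
}
```

Summing is used here because readIndicator is described as always returning a SUM of the results.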

Discovering indicators

To show how the reading syntax helps locate values, this image shows the process of getting values.


Optimizations

Indicators are created from within modules, using JPA contexts. Each module stores the results where convenient so that database queries can reach them. The Ranking API and the Matching API help save results in specialized tables; this is the standard and preferred method.

Before diving into possible process optimizations, let's see some numbers.

Quantifying results

A ranking set of indicators will create (indicators + final_ranking) * entities tuples. For instance, if you have 500 entities and need 10 indicators to calculate the ranking value, this will create (10 + 1) * 500 = 5,500 tuples. Not so bad.

A matching set of indicators will create (indicators + final_matching) * entities * related_entities tuples. There is now one more factor: the related entities to match with, which are usually contacts. Take the same example as above and assume there are 1,000 contacts to match with. Applying this factor to the previous result gives 5,500 * 1,000 = 5,500,000 tuples. Quite impressive, isn't it?
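The two estimates can be written as a small Java sketch. The class and method names are hypothetical; the + 1 stands for the final_ranking or final_matching value stored alongside the indicators.

```java
/** Hypothetical helper for illustration; not part of the Elephant API. */
final class TupleEstimate {

    /** (indicators + final_ranking) * entities */
    static long rankingTuples(long indicators, long entities) {
        return Math.multiplyExact(indicators + 1, entities);
    }

    /** (indicators + final_matching) * entities * related_entities */
    static long matchingTuples(long indicators, long entities, long relatedEntities) {
        return Math.multiplyExact(rankingTuples(indicators, entities), relatedEntities);
    }
}
```

For the example figures, rankingTuples(10, 500) gives 5,500 and matchingTuples(10, 500, 1000) gives 5,500,000.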

Three optimization approaches

Spare zeros is the first approach to optimizing the results: the API does not save zero or near-zero values. The impact of this optimization depends heavily on the selection made by the second approach, but it is at least significant for non-relevant indicators.

Fine selection affects both entities and related entities. It is difficult to implement because it requires some kind of guess about which pairs are likely to match. Its impact is high because it should eliminate zeros and, more importantly, reduce the number of database reads.

Bulk data might be the most effective approach. It makes an initial selection and does the insertions as a pre-process. The indicator then does not have to load entities separately, because it already knows the result from the data it has read.
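To illustrate the first approach, spare zeros, here is a minimal Java sketch that drops zero and near-zero values before they would be saved. The class SpareZeros and the EPSILON cut-off are assumptions made for this example; the text does not state which threshold the API actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Elephant API. */
final class SpareZeros {

    /** Assumed cut-off: the source does not define how near to zero is near enough. */
    static final double EPSILON = 1e-6;

    /** Returns only the indicator values worth storing, keeping their order. */
    static Map<String, Double> withoutZeros(Map<String, Double> values) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : values.entrySet()) {
            if (Math.abs(entry.getValue()) > EPSILON) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

In a matching run, where most entity and contact pairs share no activity, a filter of this kind would skip the bulk of the tuples estimated earlier.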

Formulas

Formulas are mainly composed of variables, constants, operators and functions. See Variables for a full list of available ranking and matching variables.

The Elephant Indicator API also adds some functions that make formulas easier to write.

Using weighted values

Since variables return absolute values, whether or not they relate to a second entity, handling the results gets complicated. To ease this, the weighted functions provide a simpler input.

weighted(value, meaning, weight)

Returns a weighted value based on:

reverseWeighted(value, meaning, weight)

Returns a weighted complementary value based on the same parameters as weighted.

Math functions

You can also use math functions to create your own calculations. Math functions are prefixed with Math; for example, Math.floor(2.3) returns the floor of its parameter, 2.

Variables

Variables are used inside formulas to create the final ranking or matching. Descriptions refer to the current entity as the entity being processed and to self as the related entity for matching, usually a contact, which gives the self point of view.

Variables are used in Formulas.

attach
Indicator

Variable Description

attachments
attach:attachments
BOTH_VARIABLE

Ranking
Counts the number of attachments in current entity.
Matching
Counts the number of self uploaded attachments in current entity.

commons
Indicator

Variable Description

following
commons:following
BOTH_VARIABLE

Ranking
Matching

followed
commons:followed
BOTH_VARIABLE

Ranking
Matching

seen
commons:seen
BOTH_VARIABLE

Ranking
Counts the number of times current entity has been seen.
Matching
Tells whether a user has seen current entity, 1=seen.

like
commons:like
BOTH_VARIABLE

Ranking
Matching

apply
commons:apply
BOTH_VARIABLE

Ranking
Matching

contact
Indicator

Variable Description

available
contact:available
RANKING_VARIABLE

Ranking

profile
contact:profile
RANKING_VARIABLE

Ranking

distance
contact:distance
MATCHING_VARIABLE

Matching

dossier
Indicator

Variable Description

participation
dossier:participation
BOTH_VARIABLE

Ranking
Counts the number of participations in current entity.
Matching
Counts the number of self participations in current entity.

categoryParticipation
dossier:categoryParticipation
BOTH_VARIABLE

Ranking
Counts the number of participations in current category's entity.
Matching
Counts the number of self participations in current category's entity.

daysCreation
dossier:daysCreation
RANKING_VARIABLE

Ranking
Returns the days passed since current entity's creation.

daysActivity
dossier:daysActivity
RANKING_VARIABLE

Ranking
Returns the days passed since current entity's last activity.

words
dossier:words
RANKING_VARIABLE

Ranking
Counts the number of words used to describe current entity.

issues
dossier:issues
BOTH_VARIABLE

Ranking
Counts the number of issues in current entity.
Matching
Counts the number of self participating issues in current entity.

issueActivity
dossier:issueActivity
BOTH_VARIABLE

Ranking
Counts the number of reported issue activity in current entity.
Matching
Counts the number of self reported issue activity in current entity.

issueResponsible
dossier:issueResponsible
RANKING_VARIABLE

Ranking

issueReporter
dossier:issueReporter
RANKING_VARIABLE

Ranking

issueQA
dossier:issueQA
RANKING_VARIABLE

Ranking

issueAssistant
dossier:issueAssistant
RANKING_VARIABLE

Ranking

siblings
dossier:siblings
MATCHING_VARIABLE

Matching
Counts the number of self siblings' participations in current entity.

siblingsCategory
dossier:siblingsCategory
MATCHING_VARIABLE

Matching
Counts the number of self siblings' participations in current category's entity.

forum
Indicator

Variable Description

topics
forum:topics
BOTH_VARIABLE

Ranking
Counts the number of topics in current entity.
Matching
Counts the number of self created topics in current entity.

posts
forum:posts
BOTH_VARIABLE

Ranking
Counts the number of posts in current entity.
Matching
Counts the number of self created posts in current entity.

generic
Indicator

Variable Description

stars
generic:stars
BOTH_VARIABLE

Ranking
Averages the number of stars for current entity.
Matching
Averages the number of self given stars for current entity.

comments
generic:comments
BOTH_VARIABLE

Ranking
Counts the number of comments in current entity.
Matching
Counts the number of self comments in current entity.

service
Indicator

Variable Description

participation
service:participation
MATCHING_VARIABLE

Matching
Counts the number of self participations in current entity.

daysCreation
service:daysCreation
RANKING_VARIABLE

Ranking
Returns the days passed since current entity's creation.

daysStart
service:daysStart
RANKING_VARIABLE

Ranking

daysEnd
service:daysEnd
RANKING_VARIABLE

Ranking

words
service:words
RANKING_VARIABLE

Ranking
Counts the number of words used to describe current entity.

student
Indicator

Variable Description

challenges
student:challenges
RANKING_VARIABLE

Ranking
Counts the number of challenges in current entity.

responses
student:responses
RANKING_VARIABLE

Ranking
Counts the number of responses to challenges in current entity.

responsesLike
student:responsesLike
RANKING_VARIABLE

Ranking

daysCreation
student:daysCreation
RANKING_VARIABLE

Ranking
Returns the days passed since current entity's creation.

distance
student:distance
MATCHING_VARIABLE

Matching