Exposed SearchInfo, closes #121

Rhet Turnbull 2020-12-26 08:08:18 -08:00
parent 9ca5d8f0fd
commit 4ece5c0d1c
11 changed files with 447 additions and 36 deletions


@ -21,6 +21,7 @@
+ [FolderInfo](#folderinfo)
+ [PlaceInfo](#placeinfo)
+ [ScoreInfo](#scoreinfo)
+ [SearchInfo](#searchinfo)
+ [PersonInfo](#personinfo)
+ [FaceInfo](#faceinfo)
+ [CommentInfo](#commentinfo)
@ -1470,6 +1471,13 @@ Returns image categorization labels associated with the photo as list of str.
#### `labels_normalized`
Returns image categorization labels associated with the photo as list of str. Labels are normalized (e.g. converted to lower case). Use of normalized strings makes it easier to search if you don't know how Apple capitalizes a label. For example:
#### <a name="photosearchinfo">`search_info`</a>
Returns [SearchInfo](#searchinfo) object that represents search metadata for the photo.
#### <a name="photosearchinfo-normalized">`search_info_normalized`</a>
Returns [SearchInfo](#searchinfo) object that represents normalized search metadata for the photo. This returns a SearchInfo object just as `search_info` but all the properties of the object return normalized text (converted to lowercase).
```python
import osxphotos
@ -1872,7 +1880,7 @@ PostalAddress(street='3700 Wailea Alanui Dr', sub_locality=None, city='Kihei', s
'96753'
```
### ScoreInfo
[PhotoInfo.score](#score) returns a ScoreInfo object that exposes the computed aesthetic scores for each photo (**Photos 5 only**). I have not yet reverse engineered the meaning of each score. The `overall` score seems to be the most useful and appears to be a composite of the other scores. The following score properties are currently available:
[PhotoInfo.score](#score) returns a ScoreInfo object that exposes the computed aesthetic scores for each photo (**Photos 5+ only**). I have not yet reverse engineered the meaning of each score. The `overall` score seems to be the most useful and appears to be a composite of the other scores. The following score properties are currently available:
```python
overall: float
@ -1911,6 +1919,71 @@ Example: find your "best" photo of food
>>> best_food_photo = sorted([p for p in photos if "food" in p.labels_normalized], key=lambda p: p.score.overall, reverse=True)[0]
```
### SearchInfo
[PhotoInfo.search_info](#photosearchinfo) and [PhotoInfo.search_info_normalized](#photosearchinfo-normalized) return a SearchInfo object that exposes various metadata that Photos uses when searching for photos such as labels, associated holiday, etc. (**Photos 5+ only**).
The following properties are available:
#### `labels`
Returns list of labels applied to photo by Photos image categorization algorithms.
#### `place_names`
Returns list of place names associated with the photo.
#### `streets`
Returns list of street names associated with the photo (e.g. from reverse geolocation of where the photo was taken).
#### `neighborhoods`
Returns list of neighborhood names associated with the photo.
#### `locality_names`
Returns list of locality names associated with the photo.
#### `city`
Returns str of city/town/municipality associated with the photo.
#### `state`
Returns str of state name associated with the photo.
#### `state_abbreviation`
Returns str of state abbreviation associated with the photo.
#### `country`
Returns str of country name associated with the photo.
#### `month`
Returns str of month name associated with the photo (e.g. month in which the photo was taken).
#### `year`
Returns str of year associated with the photo.
#### `bodies_of_water`
Returns list of bodies of water associated with the photo.
#### `holidays`
Returns list of holiday names associated with the photo.
#### `activities`
Returns list of activities associated with the photo.
#### `season`
Returns str of season name associated with the photo.
#### `venues`
Returns list of venue names associated with the photo.
#### `venue_types`
Returns list of venue types associated with the photo.
#### `media_types`
Returns list of media types associated with the photo.
#### `all`
Returns all search_info properties as a single list of strings.
#### `asdict()`
Returns all associated search_info metadata as a dict.
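Example: a minimal sketch of filtering photos by search metadata, relying only on the `search_info_normalized` property and its `all` list described above. The helper name `photos_matching` and the `SimpleNamespace` stand-in objects are hypothetical and used only so the snippet runs without a Photos library; in real use you would pass PhotoInfo objects (for example, the result of `photosdb.photos()`).

```python
from types import SimpleNamespace


def photos_matching(photos, term):
    """Return photos whose normalized search metadata contains term.

    Hypothetical helper: photos whose search_info_normalized is None
    (libraries older than Photos 5) are skipped.
    """
    term = term.lower()  # normalized text is lower case
    return [
        p
        for p in photos
        if p.search_info_normalized is not None
        and term in p.search_info_normalized.all
    ]


# Stand-in objects for illustration only (assumption, not real PhotoInfo objects)
beach = SimpleNamespace(
    search_info_normalized=SimpleNamespace(all=["beach", "kihei", "summer"])
)
old_library_photo = SimpleNamespace(search_info_normalized=None)

matches = photos_matching([beach, old_library_photo], "Summer")
```

Matching against `all` is an exact, whole-string comparison on each entry; a substring search would need to test each string in the list individually.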
### PersonInfo
[PhotosDB.person_info](#dbpersoninfo) and [PhotoInfo.person_info](#photopersoninfo) return a list of PersonInfo objects that represent persons in the database and in a photo, respectively. The PersonInfo class has the following properties and methods.


@ -102,6 +102,63 @@ _OSXPHOTOS_NONE_SENTINEL = "OSXPhotosXYZZY42_Sentinel$"
# SearchInfo categories for Photos 5, corresponds to categories in database/search/psi.sqlite
SEARCH_CATEGORY_LABEL = 2024
SEARCH_CATEGORY_PLACE_NAME = 1
SEARCH_CATEGORY_STREET = 2
SEARCH_CATEGORY_NEIGHBORHOOD = 3
SEARCH_CATEGORY_LOCALITY_4 = 4
SEARCH_CATEGORY_CITY = 5
SEARCH_CATEGORY_SUB_LOCALITY = 6
SEARCH_CATEGORY_LOCALITY_7 = 7
SEARCH_CATEGORY_LOCALITY_8 = 8
SEARCH_CATEGORY_NAMED_AREA = 9
SEARCH_CATEGORY_ALL_LOCALITY = [
SEARCH_CATEGORY_LOCALITY_4,
SEARCH_CATEGORY_SUB_LOCALITY,
SEARCH_CATEGORY_LOCALITY_7,
SEARCH_CATEGORY_LOCALITY_8,
SEARCH_CATEGORY_NAMED_AREA,
]
SEARCH_CATEGORY_STATE = 10
SEARCH_CATEGORY_STATE_ABBREVIATION = 11
SEARCH_CATEGORY_COUNTRY = 12
SEARCH_CATEGORY_BODY_OF_WATER = 14
SEARCH_CATEGORY_MONTH = 1014
SEARCH_CATEGORY_YEAR = 1015
SEARCH_CATEGORY_KEYWORDS = 2016
SEARCH_CATEGORY_TITLE = 2017
SEARCH_CATEGORY_DESCRIPTION = 2018
SEARCH_CATEGORY_HOME = 2020
SEARCH_CATEGORY_PERSON = 2021
SEARCH_CATEGORY_ACTIVITY = 2027
SEARCH_CATEGORY_HOLIDAY = 2029
SEARCH_CATEGORY_SEASON = 2030
SEARCH_CATEGORY_WORK = 2036
SEARCH_CATEGORY_VENUE = 2038
SEARCH_CATEGORY_VENUE_TYPE = 2039
SEARCH_CATEGORY_PHOTO_TYPE_VIDEO = 2044
SEARCH_CATEGORY_PHOTO_TYPE_SLOMO = 2045
SEARCH_CATEGORY_PHOTO_TYPE_LIVE = 2046
SEARCH_CATEGORY_PHOTO_TYPE_SCREENSHOT = 2047
SEARCH_CATEGORY_PHOTO_TYPE_PANORAMA = 2048
SEARCH_CATEGORY_PHOTO_TYPE_TIMELAPSE = 2049
SEARCH_CATEGORY_PHOTO_TYPE_BURSTS = 2052
SEARCH_CATEGORY_PHOTO_TYPE_PORTRAIT = 2053
SEARCH_CATEGORY_PHOTO_TYPE_SELFIES = 2054
SEARCH_CATEGORY_PHOTO_TYPE_FAVORITES = 2055
SEARCH_CATEGORY_MEDIA_TYPES = [
SEARCH_CATEGORY_PHOTO_TYPE_VIDEO,
SEARCH_CATEGORY_PHOTO_TYPE_SLOMO,
SEARCH_CATEGORY_PHOTO_TYPE_LIVE,
SEARCH_CATEGORY_PHOTO_TYPE_SCREENSHOT,
SEARCH_CATEGORY_PHOTO_TYPE_PANORAMA,
SEARCH_CATEGORY_PHOTO_TYPE_TIMELAPSE,
SEARCH_CATEGORY_PHOTO_TYPE_BURSTS,
SEARCH_CATEGORY_PHOTO_TYPE_PORTRAIT,
SEARCH_CATEGORY_PHOTO_TYPE_SELFIES,
SEARCH_CATEGORY_PHOTO_TYPE_FAVORITES,
]
SEARCH_CATEGORY_PHOTO_NAME = 2056
# Max filename length on MacOS
MAX_FILENAME_LEN = 255
@ -119,5 +176,5 @@ DEFAULT_EDITED_SUFFIX = "_edited"
DEFAULT_ORIGINAL_SUFFIX = ""
# Colors for print CLI messages
CLI_COLOR_ERROR = 'red'
CLI_COLOR_WARNING = 'yellow'
CLI_COLOR_ERROR = "red"
CLI_COLOR_WARNING = "yellow"


@ -1,5 +1,5 @@
""" version info """
__version__ = "0.38.10"
__version__ = "0.38.11"


@ -1,11 +1,32 @@
""" Methods and class for PhotoInfo exposing SearchInfo data such as labels
Adds the following properties to PhotoInfo (valid only for Photos 5):
search_info: returns a SearchInfo object
search_info_normalized: returns a SearchInfo object with properties that produce normalized results
labels: returns list of labels
labels_normalized: returns list of normalized labels
"""
from .._constants import _PHOTOS_4_VERSION, SEARCH_CATEGORY_LABEL
from .._constants import (
_PHOTOS_4_VERSION,
SEARCH_CATEGORY_CITY,
SEARCH_CATEGORY_LABEL,
SEARCH_CATEGORY_NEIGHBORHOOD,
SEARCH_CATEGORY_PLACE_NAME,
SEARCH_CATEGORY_STREET,
SEARCH_CATEGORY_ALL_LOCALITY,
SEARCH_CATEGORY_COUNTRY,
SEARCH_CATEGORY_STATE,
SEARCH_CATEGORY_STATE_ABBREVIATION,
SEARCH_CATEGORY_BODY_OF_WATER,
SEARCH_CATEGORY_MONTH,
SEARCH_CATEGORY_YEAR,
SEARCH_CATEGORY_HOLIDAY,
SEARCH_CATEGORY_ACTIVITY,
SEARCH_CATEGORY_SEASON,
SEARCH_CATEGORY_VENUE,
SEARCH_CATEGORY_VENUE_TYPE,
SEARCH_CATEGORY_MEDIA_TYPES,
)
@property
@ -24,6 +45,22 @@ def search_info(self):
return self._search_info
@property
def search_info_normalized(self):
""" returns SearchInfo object for photo that produces normalized results
only valid on Photos 5, on older libraries, returns None
"""
if self._db._db_version <= _PHOTOS_4_VERSION:
return None
# memoize SearchInfo object
try:
return self._search_info_normalized
except AttributeError:
self._search_info_normalized = SearchInfo(self, normalized=True)
return self._search_info_normalized
@property
def labels(self):
""" returns list of labels applied to photo by Photos image categorization
@ -43,14 +80,15 @@ def labels_normalized(self):
if self._db._db_version <= _PHOTOS_4_VERSION:
return []
return self.search_info.labels_normalized
return self.search_info_normalized.labels
class SearchInfo:
""" Info about search terms such as machine learning labels that Photos knows about a photo """
def __init__(self, photo):
""" photo: PhotoInfo object """
def __init__(self, photo, normalized=False):
""" photo: PhotoInfo object
normalized: if True, all properties return normalized (lower case) results """
if photo._db._db_version <= _PHOTOS_4_VERSION:
raise NotImplementedError(
@ -58,6 +96,7 @@ class SearchInfo:
)
self._photo = photo
self._normalized = normalized
self.uuid = photo.uuid
try:
# get search info for this UUID
@ -69,25 +108,170 @@ class SearchInfo:
@property
def labels(self):
""" return list of labels associated with Photo """
if self._db_searchinfo:
labels = [
rec["content_string"]
for rec in self._db_searchinfo
if rec["category"] == SEARCH_CATEGORY_LABEL
]
else:
labels = []
return labels
return self._get_text_for_category(SEARCH_CATEGORY_LABEL)
@property
def labels_normalized(self):
""" return list of normalized labels associated with Photo """
def place_names(self):
""" returns list of place names """
return self._get_text_for_category(SEARCH_CATEGORY_PLACE_NAME)
@property
def streets(self):
""" returns list of street names """
return self._get_text_for_category(SEARCH_CATEGORY_STREET)
@property
def neighborhoods(self):
""" returns list of neighborhoods """
return self._get_text_for_category(SEARCH_CATEGORY_NEIGHBORHOOD)
@property
def locality_names(self):
""" returns list of other locality names """
locality = []
for category in SEARCH_CATEGORY_ALL_LOCALITY:
locality += self._get_text_for_category(category)
return locality
@property
def city(self):
""" returns city/town """
city = self._get_text_for_category(SEARCH_CATEGORY_CITY)
return city[0] if city else ""
@property
def state(self):
""" returns state name """
state = self._get_text_for_category(SEARCH_CATEGORY_STATE)
return state[0] if state else ""
@property
def state_abbreviation(self):
""" returns state abbreviation """
abbrev = self._get_text_for_category(SEARCH_CATEGORY_STATE_ABBREVIATION)
return abbrev[0] if abbrev else ""
@property
def country(self):
""" returns country name """
country = self._get_text_for_category(SEARCH_CATEGORY_COUNTRY)
return country[0] if country else ""
@property
def month(self):
""" returns month name """
month = self._get_text_for_category(SEARCH_CATEGORY_MONTH)
return month[0] if month else ""
@property
def year(self):
""" returns year """
year = self._get_text_for_category(SEARCH_CATEGORY_YEAR)
return year[0] if year else ""
@property
def bodies_of_water(self):
""" returns list of body of water names """
return self._get_text_for_category(SEARCH_CATEGORY_BODY_OF_WATER)
@property
def holidays(self):
""" returns list of holiday names """
return self._get_text_for_category(SEARCH_CATEGORY_HOLIDAY)
@property
def activities(self):
""" returns list of activity names """
return self._get_text_for_category(SEARCH_CATEGORY_ACTIVITY)
@property
def season(self):
""" returns season name """
season = self._get_text_for_category(SEARCH_CATEGORY_SEASON)
return season[0] if season else ""
@property
def venues(self):
""" returns list of venue names """
return self._get_text_for_category(SEARCH_CATEGORY_VENUE)
@property
def venue_types(self):
""" returns list of venue types """
return self._get_text_for_category(SEARCH_CATEGORY_VENUE_TYPE)
@property
def media_types(self):
""" returns list of media types (photo, video, panorama, etc) """
types = []
for category in SEARCH_CATEGORY_MEDIA_TYPES:
types += self._get_text_for_category(category)
return types
@property
def all(self):
""" return all search info properties in a single list """
all = (
self.labels
+ self.place_names
+ self.streets
+ self.neighborhoods
+ self.locality_names
+ self.bodies_of_water
+ self.holidays
+ self.activities
+ self.venues
+ self.venue_types
+ self.media_types
)
if self.city:
all += [self.city]
if self.state:
all += [self.state]
if self.state_abbreviation:
all += [self.state_abbreviation]
if self.country:
all += [self.country]
if self.month:
all += [self.month]
if self.year:
all += [self.year]
if self.season:
all += [self.season]
return all
def asdict(self):
""" return dict of search info """
return {
"labels": self.labels,
"place_names": self.place_names,
"streets": self.streets,
"neighborhoods": self.neighborhoods,
"city": self.city,
"locality_names": self.locality_names,
"state": self.state,
"state_abbreviation": self.state_abbreviation,
"country": self.country,
"bodies_of_water": self.bodies_of_water,
"month": self.month,
"year": self.year,
"holidays": self.holidays,
"activities": self.activities,
"season": self.season,
"venues": self.venues,
"venue_types": self.venue_types,
"media_types": self.media_types,
}
def _get_text_for_category(self, category):
""" return list of text for a specified category ID """
if self._db_searchinfo:
labels = [
rec["normalized_string"]
content = "normalized_string" if self._normalized else "content_string"
return [
rec[content]
for rec in self._db_searchinfo
if rec["category"] == SEARCH_CATEGORY_LABEL
if rec["category"] == category
]
else:
labels = []
return labels
return []


@ -43,6 +43,7 @@ class PhotoInfo:
# import additional methods
from ._photoinfo_searchinfo import (
search_info,
search_info_normalized,
labels,
labels_normalized,
SearchInfo,
@ -980,6 +981,7 @@ class PhotoInfo:
comments = [comment.asdict() for comment in self.comments]
likes = [like.asdict() for like in self.likes]
faces = [face.asdict() for face in self.face_info]
search_info = self.search_info.asdict() if self.search_info else {}
return {
"library": self._db._library_path,
@ -1041,6 +1043,7 @@ class PhotoInfo:
"original_filesize": self.original_filesize,
"comments": comments,
"likes": likes,
"search_info": search_info,
}
def json(self):


@ -104,17 +104,19 @@ def _process_searchinfo(self):
for row in c:
uuid = ints_to_uuid(row[1], row[2])
# strings have null character appended, so strip it
record = {}
record["uuid"] = uuid
record["rowid"] = row[0]
record["uuid_0"] = row[1]
record["uuid_1"] = row[2]
record["groupid"] = row[3]
record["category"] = row[4]
record["owning_groupid"] = row[5]
record["content_string"] = normalize_unicode(row[6].replace("\x00", ""))
record = {
"uuid": uuid,
"rowid": row[0],
"uuid_0": row[1],
"uuid_1": row[2],
"groupid": row[3],
"category": row[4],
"owning_groupid": row[5],
"content_string": normalize_unicode(row[6].replace("\x00", "")),
}
record["normalized_string"] = normalize_unicode(row[7].replace("\x00", ""))
record["lookup_identifier"] = row[8]
record["lookup_identifier"] = normalize_unicode(row[8].replace("\x00", ""))
try:
_db_searchinfo_uuid[uuid].append(record)

File diff suppressed because one or more lines are too long


@ -210,7 +210,7 @@ def test_search_info(photosdb):
def test_labels_normalized(photosdb):
for uuid in LABELS_NORMALIZED_DICT:
photo = photosdb.photos(uuid=[uuid])[0]
assert sorted(photo.search_info.labels_normalized) == sorted(
assert sorted(photo.search_info_normalized.labels) == sorted(
LABELS_NORMALIZED_DICT[uuid]
)
assert sorted(photo.labels_normalized) == sorted(LABELS_NORMALIZED_DICT[uuid])


@ -349,7 +349,7 @@ def test_labels_normalized(photosdb):
for uuid in LABELS_NORMALIZED_DICT:
photo = photosdb.photos(uuid=[uuid])[0]
logging.warning(f"uuid = {uuid}")
assert sorted(photo.search_info.labels_normalized) == sorted(
assert sorted(photo.search_info_normalized.labels) == sorted(
LABELS_NORMALIZED_DICT[uuid]
)
assert sorted(photo.labels_normalized) == sorted(LABELS_NORMALIZED_DICT[uuid])


@ -0,0 +1,57 @@
""" test SearchInfo class """
import json
import os
import pytest
import osxphotos
# These tests must be run against the author's personal photo library
skip_test = "OSXPHOTOS_TEST_EXPORT" not in os.environ
pytestmark = pytest.mark.skipif(
skip_test, reason="These tests only run against system Photos library"
)
PHOTOS_DB = "/Users/rhet/Pictures/Photos Library.photoslibrary"
with open("tests/search_info_test_data_10_15_7.json") as fp:
test_data = json.load(fp)
UUID_SEARCH_INFO = test_data["UUID_SEARCH_INFO"]
UUID_SEARCH_INFO_NORMALIZED = test_data["UUID_SEARCH_INFO_NORMALIZED"]
UUID_SEARCH_INFO_ALL = test_data["UUID_SEARCH_INFO_ALL"]
UUID_SEARCH_INFO_ALL_NORMALIZED = test_data["UUID_SEARCH_INFO_ALL_NORMALIZED"]
@pytest.fixture(scope="module")
def photosdb():
return osxphotos.PhotosDB(dbfile=PHOTOS_DB)
def test_search_info(photosdb):
for uuid in UUID_SEARCH_INFO:
photo = photosdb.get_photo(uuid)
assert photo.search_info.asdict() == UUID_SEARCH_INFO[uuid]
def test_search_info_normalized(photosdb):
for uuid in UUID_SEARCH_INFO_NORMALIZED:
photo = photosdb.get_photo(uuid)
assert (
photo.search_info_normalized.asdict() == UUID_SEARCH_INFO_NORMALIZED[uuid]
)
def test_search_info_all(photosdb):
for uuid in UUID_SEARCH_INFO_ALL:
photo = photosdb.get_photo(uuid)
assert sorted(photo.search_info.all) == sorted(UUID_SEARCH_INFO_ALL[uuid])
def test_search_info_all_normalized(photosdb):
for uuid in UUID_SEARCH_INFO_ALL_NORMALIZED:
photo = photosdb.get_photo(uuid)
assert sorted(photo.search_info_normalized.all) == sorted(
UUID_SEARCH_INFO_ALL_NORMALIZED[uuid]
)


@ -0,0 +1,34 @@
""" Create the test data needed for test_search_info_10_15_7.py """
# reads data from the author's system photo library to build the test data
# used to test SearchInfo
import json
import osxphotos
UUID = [
"C8EAF50A-D891-4E0C-8086-C417E1284153",
"71DFB4C3-E868-4BE4-906E-D96BD8692D7E",
"2C151013-5BBA-4D00-B70F-1C9420418B86",
]
data = {
"UUID_SEARCH_INFO": {},
"UUID_SEARCH_INFO_NORMALIZED": {},
"UUID_SEARCH_INFO_ALL": {},
"UUID_SEARCH_INFO_ALL_NORMALIZED": {},
}
photosdb = osxphotos.PhotosDB()
for uuid in UUID:
photo = photosdb.get_photo(uuid)
search = photo.search_info
search_norm = photo.search_info_normalized
data["UUID_SEARCH_INFO"][uuid] = search.asdict()
data["UUID_SEARCH_INFO_NORMALIZED"][uuid] = search_norm.asdict()
data["UUID_SEARCH_INFO_ALL"][uuid] = search.all
data["UUID_SEARCH_INFO_ALL_NORMALIZED"][uuid] = search_norm.all
print(json.dumps(data))